The notion of phases offers a framework for understanding compiled strongly typed languages, and
works  toward  an  improved,  strongly  typed  language  basis  for  reusable  software.  The  research  shows 
how  types  can  be  manipulated  as  first-class  values,  and  notions  of  compiletime  and  runtime  can  be 
unified,  without  sacrificing  strong  typing  (compiletime  type  checking)  or  runtime  speed.  Type 
checking  and  expression  evaluation  are  performed  using  the  same  evaluation  mechanism. 

The  apparent  conflict  of  allowing  types  as  first-class  values,  yet  enforcing  compiletime  type  checking, 
is  resolved  by  the  notion  of  multiple  phases:  though  types  may  be  manipulated  as  first-class  values 
during one phase, the computed type values become invariants for the next phase.

We demonstrate the notion of phases by defining a sample source language, Phi, which looks like a
typed lambda calculus; an object language, IL, which is syntactically similar to an untyped lambda
calculus,  but  is  strongly  typed;  an  associated  IL  Machine  that  interprets  IL  programs;  and  a  translator 
for  converting  Phi  programs  to  IL  programs.  Strong  typing  is  guaranteed  in  spite  of  the  fact  that  the 
Phi  translator  does  no  type  checking.  We  also  discuss  how  phases  might  be  used  to  efficiently 
perform  partial  evaluation. 

A phase, i, is the execution of an IL program, p_i. The result may be another IL program, p_(i+1), to be
executed in phase i+1, or it may be the desired final answer. Phase i acts as compiletime for phase
i+1, doing all type checking necessary to guarantee that program p_(i+1) is free of runtime type errors.
During phase i, program p_i can manipulate types as first-class values; in general these computed
types will be invariants of the next phase.
Chapter 1
Introduction 


1.1.  Research  Contribution 

This  dissertation  describes  a  programming  language  notion  --  phases  --  and  an  associated 
programming  method.  Its  contribution  is  both  practical  and  academic:  it  takes  a  small  step  toward 
providing a strongly typed language basis for more reusable software; and it provides a more general,
unified  view  of  certain  notions  in  programming  languages  and  methodology,  including  compiletime 
and  runtime. 

This  work  is  not  advocating  any  particular  programming  language  or  method.  The  main  intent  of 
this  dissertation  is  to  expose  the  essence  of  multiple  strongly  typed  evaluation  phases,  without 
encumbering  the  reader  with  extraneous  details  or  tangential  issues.  We  illustrate  the  essential  ideas  by 
defining some pedagogical languages based on the Lambda Calculus [Barendregt 84]: Phi and IL. As
of  this  writing,  two  versions  of  these  languages  have  been  implemented  and  tested. 

1.1.1.  Ideas  in  This  Research 

It  is  often  difficult,  in  reading  research  reports,  to  distill  the  important  ideas  being  advocated  from 
the  mundane  details  of  the  particular  system  described.  Outlined  below  are  what  the  author  considers 
to  be  the  most  interesting  ideas  embodied  in  this  research. 

1.1.1.1. An Abstract Data Type for Type-checked Program Fragments

A  particular  abstract  data  type  (the  data  type  ERT)  is  defined  for  constructing  and  manipulating 
type-checked  programs.  This  allows  program  fragments  to  be  securely  manipulated  as  data,  and  thus 
allows compiletime operations to be treated in the same manner as runtime operations. The primitive
operations implementing this data type ensure that every program constructed in this way is
syntactically  correct  and  strongly  typed.  (Section  2.2.1.) 
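
The flavor of such a data type can be suggested with a small Python sketch. It is not the ERT definition of Section 2.2.1: the names `Fragment`, `lit`, and `add`, the tuple representation, and the two-type universe are all assumptions made for illustration. The point is only that when fragments are built solely through checked constructors, every fragment that exists is well typed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A program fragment paired with the type it was checked at."""
    code: tuple   # hypothetical tree representation of the fragment
    type: str     # hypothetical type name: "int" or "str"


def lit(value):
    """Build a literal fragment; only int and str literals are supported."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError(f"unsupported literal: {value!r}")
    return Fragment(("lit", value), type(value).__name__)


def add(left, right):
    """Build an integer addition; refuse to build it from non-int operands."""
    if left.type != "int" or right.type != "int":
        raise TypeError(
            f"add expects int operands, got {left.type} and {right.type}"
        )
    return Fragment(("add", left.code, right.code), "int")


# A well-typed fragment can be constructed and passed around as data.
total = add(lit(1), lit(2))
```

Python cannot hide the `Fragment` constructor itself, so the guarantee in this sketch rests on convention; a true abstract data type would enforce it.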
1.1.1.2. One Machine Acts as Compiler and Runtime Machine

Notions  of  compiletime  and  runtime  are  unified:  compiletime  operations  are  generalized  and 
become a superset of runtime operations. A single abstract machine can do both efficiently. (Section
3.1.2.)

1.1.1.3. Multiple Strongly Typed Evaluation Phases

Each  phase  is  the  execution  of  a  program  on  the  abstract  machine.  The  result  of  each  phase  may  be 
the  final  answer  or  another  type-checked  program.  Each  phase  type  checks  and  generates  the  program 
for the next phase. Types may be manipulated as first-class values during any phase; they become
invariants  for  the  next  phase.  "Compiletime"  and  "runtime"  thus  become  relative  terms.  (Sections  3.2 
and  3.3.) 
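
The control flow described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Program` and `run_phases` are hypothetical names, a program is modeled as a zero-argument callable, and the type checking that each phase performs on the program it generates is omitted entirely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Program:
    """Hypothetical stand-in for a program to be executed in some phase."""
    run: Callable[[], object]


def run_phases(program):
    """Execute phase after phase until the result is not another Program."""
    phases_run = 0
    result = program
    while isinstance(result, Program):
        result = result.run()   # this phase yields the next program or the answer
        phases_run += 1
    return result, phases_run


def phase0():
    # A value computed freely during phase 0 ...
    width = 4 * 8
    # ... is fixed (an invariant) in the program generated for phase 1.
    return Program(lambda: width + 1)


answer, phases_run = run_phases(Program(phase0))
```

In this sketch phase 0 plays the role of "compiletime" for phase 1, which is the sense in which the two terms become relative.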

1.1.1.4. The Translator Does No Type Checking

Given  a  program  in  the  source  language,  the  translator  can  produce  a  strongly  typed  program  in  the 
implementation language without doing any type checking. That is, the translator does no type
checking, but the resulting program is guaranteed free of runtime type errors. This fact may at first
sound contradictory; it is explained in Section 3.3.3.

1.1.1.5. One Machine Does Partial and Full Evaluation

A single abstract machine can efficiently do both partial evaluation and full evaluation. (Chapter 5.)

1.1.1.6. Phase Compilation

Chapter 5 discusses how phases might be used for partial evaluation for a strongly typed language.
Partial evaluation is often slow, but it might be made more efficient by using two steps: phase
compilation and phase evaluation.

Given a list of the free variables to be given fixed values, a phase compiler would prepare a program
for phase evaluation, which will achieve the effect of efficient partial/full evaluation. The program
would first be "phase compiled," using a list of the free variables -- and their types -- to be instantiated.
Efficient partial/full evaluation would then be performed by executing this "phase compiled" program
using phase evaluation. (Section 5.4.)


1.1.1.7. An Unusual View of Abstract Data Types

Since  types  and  code  are  first-class  values,  our  view  of  Abstract  Data  Types  (ADTs)  is  in  terms  of 
what  primitive  functions  are  necessary  in  order  to  support  user-defined  ADTs.  Operationally,  one 
needs  these  functions  in  order  to  convert  between  the  domains  of  the  abstraction  and  the 
representation.  However,  they  can  be  ordinary  functions  rather  than  special  language  constructs. 
(Section  6.1.) 

1.2.  Background  to  This  Research 

Programming  language  experts  should  read  the  definitions  of  "runtime  type  errors"  and  "strong 
typing"  in  Sections  1.2.1.1  and  1.2.1.2,  but  may  otherwise  wish  to  skip  to  Section  1.3,  which  describes 
this  research. 

1.2.1.  Two  Models  of  Program  Evaluation:  Interpreted  and  Compiled 

Figure  1-1  shows  two  models  of  how  a  source  program  written  in  some  language  L  might  be 
evaluated. 

In  the  interpreted  case,  the  program  is  given  directly  to  an  interpreter.  A  program  generally  also 
needs  a  specified  environment  which  might  include  values  for  the  program's  input  variables  and 
definitions  of  some  standard  functions.  The  interpreter  runs  the  program  with  the  given  environment 
and  produces  the  desired  result  of  the  computation  --  the  final  answer,  which  might  be  some  number,  a 
character  string,  or  a  more  complex  object  such  as  a  file. 

In  the  compiled  case,  the  source  program  is  first  translated,  by  a  compiler,  into  an  object  program  in 
some other language L'; this step is called compiletime. An L' interpreter then runs the object program
with the desired environment to produce the final answer; this step is called runtime. The object
program may be stored and run repeatedly using different environments or inputs, without
retranslating the source program.

This work concerns the compiled, rather than the interpreted, model.


Figure 1-1: Two Models of Program Evaluation
1.2.1.1. Definition: Runtime Type Errors

Suppose  the  source  program  contains  a  mistake,  causing  the  L  interpreter  (in  the  interpreted  case)  or 
the  L’  interpreter  (in  the  compiled  case)  to  try  to  apply  some  erroneous  operation,  such  as  multiplying 
two  character  strings.  It  may  be  detected  by  the  interpreter,  and  an  error  message  issued,  or  it  may  not 
be  detected,  in  which  case  the  result  of  the  computation  will  be  garbage.  In  either  case,  it  is  called  a 
runtime  type  error  to  distinguish  it  from  any  errors  that  the  compiler  might  issue  before  the  program  is 
executed. 

1.2.1.2. Definition: Strong Typing

If  the  source  program  contains  adequate  information  about  the  types  of  values  to  be  computed,  the 
compiler  can  ensure  that  the  generated  object  program  will  be  free  of  runtime  type  errors.  Strong 
typing  means  providing  an  a  priori,  or  compiletime,  guarantee  against  runtime  type  errors.1 
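
As a minimal illustration of what an a priori guarantee means, the Python sketch below determines the type of an expression without evaluating it, and rejects the multiplication of two character strings before anything is run. The function `type_of`, the tuple encoding of expressions, and the restriction to a single operator are assumptions of this sketch, not part of any language described in this work.

```python
def type_of(expr):
    """Return "int" or "str" for expr without evaluating it.

    expr is an int literal, a str literal, or a tuple ("*", left, right).
    Raises TypeError if the expression could cause a runtime type error.
    """
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        return "str"
    op, left, right = expr
    left_type, right_type = type_of(left), type_of(right)
    if op == "*" and left_type == "int" and right_type == "int":
        return "int"
    raise TypeError(f"cannot apply {op} to {left_type} and {right_type}")


# Accepted: the product of two integers has type int.
checked = type_of(("*", 2, ("*", 3, 4)))
```

An expression such as `("*", "ab", "cd")` is rejected by `type_of` itself, so the erroneous multiplication is never attempted.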

This  work  concerns  only  languages  providing  strong  typing. 

1.2.2. The Purposes of Compiling

There  are  two  basic  advantages  to  compiling  the  source  program,  as  opposed  to  interpreting  it 
directly:  type  security  and  efficiency. 

Type security  Because the source language is strongly typed, the compiler can provide an a priori
               guarantee that no runtime type errors will occur when the object program is
               executed on the implementation machine. This provides an assurance that the
               program is at least partially correct, without executing the program.

Efficiency     A compiler can improve a program's runtime efficiency in three ways: by computing
               constant expressions at compiletime; by selecting optimal object program code,
               based on values and types known at compiletime; and by translating the program
               into a language inherently more efficient for the implementation machine to
               execute.2


1 The question sometimes arises: Is division by zero considered a runtime type error? What about an array index out of
bounds? Or an attempt to read beyond the end of the input?

Paul Eggert [Eggert 81] has shown that it is possible to define the type system securely enough that such runtime errors are not
possible. For example, one can define a type non-zero-integers that includes all integers except zero, and another type
possibly-zero-integers that includes all integers. Only values of the type non-zero-integers would be allowed as divisors, and (for
example) subtraction of two non-zero-integers would yield a result of type possibly-zero-integers. To convert a value of type
possibly-zero-integers to a value of type non-zero-integers, one must use a special case-conformity clause placing the detection of
a zero value under explicit program control. The type system can similarly be defined in such a way that array-index-out-of-
bounds and other such errors are not possible. Although many languages that purport to be strongly typed, such as Pascal, allow
such loopholes in the type system, this work assumes that the type system is defined securely enough that such runtime errors are
not possible.


1.2.3. Problems with Traditional Compiled Languages


The  benefits  of  compiling  are  well  established,  and  languages  specifically  designed  to  be  compiled  -- 
compiled  languages  --  are  common.  In  spite  of  these  advantages,  there  are  some  problems  with 
traditional  compiled  languages. 


1.2.3.1. Lack of Programmer Control


Inherently,  the  compiler  must  know  a  great  deal  about  the  source  program  and  the  types  of  values 
being  manipulated  in  order  to  produce  an  efficient,  type-checked  object  program.  However,  the 
programmer  generally  does  not  have  access  to  much  of  this  compiletime  information. 


For example, in Pascal, there is no way to ask for the size of an array or for the first value of an
enumerated type.3 Certainly the compiler has this information, but the programmer has no way of
accessing it.


1.2.3.2. Ad Hoc Notions


It  is  easy  to  see  similarities  between  the  kinds  of  operations  performed  by  the  compiler  at 
compiletime, and the operations performed under program control at runtime. In spite of the
conceptual  similarities,  compiletime  notions  tend  to  be  ad  hoc.  For  example,  the  shortcomings  of 
Pascal  mentioned  above  were  addressed  in  Ada4  by  supplying  attribute  operations,  which  ask  for  an 
array's  size  or  an  enumerated  type’s  first  value.  The  Ada  Reference  Manual  [Ada  82]  defines  48  such 
attributes!  Some  of  these  attributes  are  computable  at  compiletime  and  some  are  not. 


Type expressions are usually treated very differently from other -- conventional -- expressions, such
as numeric expressions. In fact, they usually have different syntactic rules. Consider Pascal. One can
define a variable x to be some user-defined type t:

var  x:  t;  {  t  is  some  user-defined  type  } 

Or  one  can  declare  x  using  an  array  type  expression  involving  t: 


2 This third method of improving efficiency will be ignored when we generalize compiletime to arrive at the notion of phases.
However, the idea of translating to a more efficient language is not incompatible with the notion of phases. Instead of executing
an Implementation Language program directly, we could first translate it to another more efficiently executed language.

3 An enumerated type is a type for which all values are explicitly listed, for example, type color = (red, green, blue).

4 Ada is a registered trademark of the U.S. Government, Ada Joint Program Office.
var x: array [1..20] of t;

However,  one  cannot  compute  an  arbitrary  function  of  t : 
var  x:  f(t) ;  {  Illegal  } 

To  various  extents,  some  languages,  such  as  Donahue’s  Extended  Lambda  Calculus  [Donahue  79], 
Russell [Boehm 80], EL1 [Wegbreit 74] and Pebble [Burstall 84], do treat types as first-class values; that
is, one may use type variables and write functions and expressions involving types. However, these
languages tend to syntactically separate type expressions from normal expressions, restrict the kinds of
computations allowed on types to ensure that the type values are statically computable, or forego
strong typing and use runtime type checking. Pebble's treatment of types is general and uniform in
these respects, but it does treat one aspect of types differently, as mentioned in Section 1.2.3.4.

1.2.3.3.  The  Conflict  Between  "Strong  Typing"  and  "Types  as  First-Class  Values" 

The  motivation  for  allowing  types  as  first-class  values  is  clear:  the  abilities  to  parameterize  by  types, 
use  arbitrary  algorithms  to  construct  new  types,  and  make  decisions  based  on  types,  would  support 
more  reusable  software.  Similarly,  the  benefits  of  strong  typing  are  well  established:  type  security  and 
efficiency. 

Unfortunately, there is an inherent conflict between allowing types as first-class values and the desire
for strong typing. Basically, strong typing requires that the type of every expression be known before
runtime.  However,  allowing  types  as  first-class  values  means  that  types  may  involve  arbitrary 
expressions,  use  variables,  invoke  functions,  depend  on  input,  etc. 

1.2.3.4.  Different  Mechanisms  for  Type  Checking  and  Evaluation 

Type  checking  is  similar  to  program  evaluation.  The  similarity  is  readily  apparent  when  one 
compares a typical language's semantic rules for type checking with its semantic rules for evaluation:
both draw conclusions about an expression's value or type based on the values or types of the
expression's  subexpressions,  and  both  follow  lexical  scoping  rules  for  identifiers. 

Nonetheless,  strongly  typed  languages  have  invariably  defined  separate  mechanisms  for  type 
checking  and  program  evaluation.  For  example,  even  in  Burstall  and  Lampson's  Pebble  [Burstall  84], 
though  type  checking  involves  evaluation,  a  different  mechanism  is  used  for  type  checking  than  for 
evaluation. This is shown clearly in Table 6, Section 5.3 of Pebble [Burstall 84], where the type
checking rules are separated from the evaluation rules to form what is essentially a different machine.
(The rules are separated to demonstrate the distinction between the act of type checking and the act of
evaluation.) Both sets of rules apply to the same language constructs, but they are applied at different
times, depending on whether the program is being type checked or executed.

1.3.  This  Research 

Can  types  and  code  be  manipulated  effectively  under  programmer  control  during  compiletime,  while 
retaining  strong  typing?  Can  compiletime  notions  such  as  type  checking  be  unified  with  runtime 
notions? 

The answer is "Yes." The language notion of multiple strongly typed evaluation phases5 unifies
compiletime  and  runtime,  and  allows  types  and  code  to  be  manipulated  as  first-class  values,  while 
retaining  strong  typing.  Types,  manipulated  as  first-class  values  in  one  phase,  become  invariants  of  the 
next  phase,  as  explained  in  Section  3.3.2.  Phases  might  also  be  used  to  perform  partial  evaluation,  as 
discussed  in  Section  5.  The  purpose  of  this  work  is  to  explore  and  introduce  the  notion  of  multiple 
strongly  typed  evaluation  phases.6 

Our  particular  approach  to  type  checking  was  motivated  by  certain  key  biases: 

-  A  firm  belief  in  strong  typing,  that  is,  in  providing  an  a  priori  guarantee  that  a  program  is 
free  of  any  possible  runtime  type  errors. 

-  A  desire  to  unify  the  notions  of  compiletime  and  runtime. 

-  An  orientation  toward  explicit  programmer  expression  rather  than  inference  performed  by 
the  language  implementation.  These  orientations  are  contrasted  in  Section  1.3.1. 

-  A  desire  to  support  the  general  programming  method  described  in  Section  2.1. 


5 The term phases is often used in this work instead of the longer, more descriptive term multiple strongly typed evaluation
phases.

6 This work was approached a little differently than most doctoral research. Rather than first carefully defining a problem and
then seeking a solution, we pursued an interesting idea and developed it to see how it might be useful. This unusual approach is
risky, because there is less assurance of a useful outcome, and it places a greater burden on the researcher for scholarly review
and integration of related work. Nonetheless, this approach should be encouraged much more. The traditional approach of
defining a problem and then seeking a solution is contrary to creativity because every problem definition presupposes a certain
view of the world. The most interesting and innovative developments are those that change one's view of the world, making
problems irrelevant instead of solving them.
1.3.1. Expressing versus Inferring


Programming  languages  are  designed  under  two  competing  orientations:  the  programmer  can 
express  information,  or  the  language  implementation  can  infer  the  information.  The  Phi  language 
described  in  Section  4.1  is  strongly  oriented  toward  expressing  rather  than  inferring.  This  section 
explains  this  choice  and  the  differences  between  the  two  orientations. 

For  example,  rather  than  requiring  the  compiler  to  infer  the  type  of  an  expression  from  its  context,  in 
Phi  the  type  is  simply  computed  as  any  other  computation.  Another  example  of  this  distinction  is  that 
type checking polymorphic functions in ML [Gordon 79] involves unification, a process of pattern
matching to find the most general type solution. If the same kind of polymorphic functions were
offered in a language oriented toward programmer expression, the programmer would have the
responsibility of expressing the desired type solution, and the language should provide useful type
operations to make this easy. This is like the difference between proof checking and proof discovery.

1.3.1.1. Advantages of Inference over Expression

The  main  argument  for  having  the  compiler  infer  whatever  it  can  is  that  it  reduces  the  burden  on  the 
programmer.  This  is  a  good  argument,  but  it  is  not  prima-facie  evidence  that  compiler  inference  is 
preferable  to  language  expressiveness.  It  does,  however,  point  out  that  ease  of  expression  is  very 
important. Concise syntactic constructs and libraries of reusable components should be provided to
make  expression  easy. 

Another  argument  for  having  the  compiler  infer  information  is  that  the  inferences  are  assured  correct 
(assuming  that  the  compiler  is  correct,  and  that  the  programmer  understands  the  inferences).  If  the 
programmer  is  given  the  responsibility  of  computing  the  types  of  expressions,  for  example,  it  is 
conceivable  that  the  programmer  would  occasionally  make  a  mistake  and  compute  the  wrong  type, 
thus  allowing  an  operation  to  be  applied  erroneously.  In  this  case  (to  ensure  strong  typing),  if  the 
programmer is allowed to compute types arbitrarily, it is clear that the compiler must have some way of
verifying that any computed types are in fact legal.

1.3.1.2. Disadvantages of Inference as Opposed to Expression


One  disadvantage  of  relying  on  the  compiler  to  infer  information  is  that  the  compiler  must  be  more 
complex. Thus, compilation may involve such tasks as unification or solving systems of simultaneous
equations. 


Perhaps  the  most  important  disadvantage,  though,  is  that  the  programmer  may  want  to  express 
things  that  the  compiler  is  not  capable  of  inferring.  This  may  be  viewed  as  both  a  theoretical  and  a 
practical  problem.  As  a  simple  example  of  the  theoretical  difficulty,  suppose  that  every  expression  in 
the  language  must  be  guaranteed  to  halt,  and  that  this  is  considered  part  of  the  expression’s  type 
correctness. The halting problem shows that this is theoretically impossible for the compiler to
algorithmically determine; however, a compiler could much more easily verify a proof supplied by the
programmer.  As  another  example  of  the  theoretical  difficulty,  Coppo  [Coppo  80]  asserts  that  when  the 
type  system  of  ML  [Gordon  79]  is  extended,  the  question  of  whether  a  term  possesses  a  type  becomes 
only  "semi-decidable". 

The  practical  difficulty  is  that  the  compiler  may  not  be  smart  enough  to  allow  constructs  that  the 
programmer  may  wish  to  express.  And  unfortunately,  making  the  compiler  smarter  generally  makes  it 
more  complex. 

1.3.1.3. The Gray Area Between Inference and Expression

There  is  no  rigid  distinction  between  inference  and  expression.  For  example,  under  the  expressive 
orientation,  a  library  routine  implementing  an  inference  engine  could  be  provided.  Or  conversely,  a 
language  implementation's  inference  rules  could  simulate  expression  evaluation.  Language  processors 
generally  contain  elements  of  both  inference  and  expression. 

The work presented here is based on a strong bias toward expression, tempered with the compiletime
checks  necessary  to  ensure  that  any  computed  type  values  are  legal.  We  do  not  intend  to  argue  that 
expression  is  unequivocally  better  than  inference.  We  are  simply  pointing  out  the  importance  of  this 
orientation  with  respect  to  this  work. 

1.4.  Related  Work 

1.4.1.  Pebble 

The Pebble language, by Burstall and Lampson [Burstall 84], uniformly allows types, bindings, and
declarations as first-class values. Pebble's bindings are name-value pairs; they are essentially
environments.  Giving  explicit  access  to  bindings  as  first-class  values  makes  it  easy  to  build  and  access 
libraries  or  modules  of  reusable  functions  or  other  values  under  programmer  control.  Pebble's 
declarations  are  the  types  of  bindings. 




For  simplicity,  and  to  focus  attention  only  on  the  notion  of  phases,  in  Phi  we  do  not  provide  bindings 
and  declarations  as  first-class  values,  though  they  would  be  very  interesting  to  add.  The  idea  fits  our 
general philosophy perfectly.7

Pebble also provides dependent types (see Section 6.2), though our Phi language does not. The need
for  them  in  Phi  is  somewhat  reduced  by  the  notion  of  multiple  phases;  this  is  discussed  in  Section  6.2. 

Pebble  deals  with  language  ideas,  whereas  the  notion  of  strongly  typed  evaluation  phases  might  be 
more accurately characterized as a language implementation idea. As such, Pebble's semantic rules
have no rigid separation between evaluation stages representing compiletime and runtime. However,
Pebble's type checking and evaluation rules can be separated to provide static type checking. This
separation essentially leads to different machines (that are applied to the same program) for doing type
checking and evaluation. In contrast, our work provides a single machine that performs both roles of
type checking and evaluation, depending on the expressions in the program. To clarify this distinction:
in Pebble, whether a program is being type checked or evaluated depends on the set of rules applied --
it does not depend on the program itself. In our work, by contrast, the syntax of the program determines
whether our single type-checking-and-evaluation machine will do type checking or conventional
evaluation.




1.4.2.  Partial  Evaluation 

Partial evaluation is variously also known as symbolic evaluation, partial execution, symbolic
execution,  or  mixed  computation.  Ershov  [Ershov  77a]  [Ershov  82]  has  probably  been  its  main 
proponent. 

1.4.2.1. Definition of Partial Evaluation


Partial  evaluation  reduces  one  program  to  another  equivalent  program  in  which  some  parts  of  the 
first program have been evaluated or simplified. For example, the expression a + b + 2*3 might be
reduced to the equivalent expression a + b + 6. Or, if a value of 5 is provided for variable b, the expression
a + b + 2*3 might be partially evaluated to a + 11.


In general, if one or more of a program's input parameters are constant, the program may be partially
evaluated to produce a new, more efficient program by taking advantage of those known constant
values. 
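
The arithmetic example above can be reproduced with a small Python sketch of a partial evaluator. Everything about the encoding is an assumption made for illustration: expressions are integers, variable names, or tuples `(op, left, right)`, and the sum is grouped as a + (b + 2*3) so that no reassociation of operands is needed to reach a + 11.

```python
def partial_eval(expr, env):
    """Simplify expr as far as the known variable values in env allow.

    expr is an int, a variable name (str), or a tuple (op, left, right)
    with op either "+" or "*". The result is an int when every variable
    is known, and a smaller residual expression otherwise.
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr, expr)       # a known variable becomes a constant
    op, left, right = expr
    left = partial_eval(left, env)
    right = partial_eval(right, env)
    if isinstance(left, int) and isinstance(right, int):
        return left + right if op == "+" else left * right
    return (op, left, right)             # residual program


# a + (b + 2*3)
program = ("+", "a", ("+", "b", ("*", 2, 3)))

no_inputs = partial_eval(program, {})              # a + (b + 6)
b_fixed = partial_eval(program, {"b": 5})          # a + 11
all_fixed = partial_eval(program, {"a": 1, "b": 5})
```

When every input is supplied, the same function performs ordinary ("full") evaluation; the residual expression `b_fixed` can be stored and evaluated later with different values of a.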


7 In fact, the first implementation of phases did treat bindings and declarations as first-class values.
1.4.2.2. Uses of Partial Evaluation


Partial  evaluation  has  mainly  been  used  as  a  flexible  mechanism  for  specializing  programs.  The 
purpose  has  generally  been  to  produce  a  more  efficient  resulting  program  --  part  of  the  computation 
has  been  done  already.  This  efficiency  motive  is  one  of  the  two  basic  reasons  for  compiling  programs 
as  opposed  to  interpreting  them  directly.8  However,  the  advantage  of  partial  evaluation  over 
compilation  is  its  flexibility  --  any  subset  of  a  program's  free  variables  (or  inputs)  can  be  fixed  by 
supplying particular values for them. Gifford, Schooler, et al. [Schooler 84] are also working on using
partial  evaluation  to  perform  type  checking;  this  correctness  motive  is  the  other  basic  reason  for 
compiling. 

Partial  evaluation  is  also  useful  in  separating  notation  from  data  representation.  For  example,  in 
Pascal,  the  syntax  for  accessing  data  is  tied  to  the  representation  of  the  data,  making  it  difficult  to 
change  data  representations.  The  programmer  must  choose  between  representing  some  data  as  a 
function or in a record, a linked list, or an array, and the syntax for accessing the data reflects this
choice:

a(b)     Function invocation.
a.b      Accessing a component of a record.
a^.b     Accessing through a pointer variable.
a[b]     Array subscripting.

Function  invocation  is  the  most  general  case,  because  any  kind  of  data  structure  can  be  hidden  inside 
the  function  body.9  Why  shouldn't  the  programmer  always  hide  the  data  structure  inside  a  function? 
The  answer  is  the  traditional  high  cost  of  function  invocation.  But  using  partial  evaluation,  the 
function call can be avoided by beta-expanding10 (also called beta-reducing) the function call in-line,
thus  eliminating  the  performance  justification  for  using  specialized  notation.  Beta-expanding  recursive 
functions  can  be  a  problem  in  general,  but  since  (at  the  moment)  we  are  simply  discussing  the 
possibility  of  hiding  data  structure  access  inside  of  function  calls,  recursive  functions  are  not  an  issue 
here. 
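
A minimal sketch of beta expansion on untyped lambda terms, written in Python, shows what expanding a call in-line involves and why lexical scoping needs care. The term encoding (`("var", x)`, `("lam", x, body)`, `("app", f, arg)`) and the helper names `free_vars`, `fresh_name`, `subst`, and `beta` are assumptions of this sketch; it handles only a single non-recursive call.

```python
import itertools


def free_vars(term):
    """Return the set of variable names occurring free in term."""
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "lam":
        return free_vars(term[2]) - {term[1]}
    return free_vars(term[1]) | free_vars(term[2])   # "app"


def fresh_name(base, avoid):
    """Return a name derived from base that is not in the set avoid."""
    for i in itertools.count():
        candidate = f"{base}_{i}"
        if candidate not in avoid:
            return candidate


def subst(term, name, value):
    """Replace free occurrences of name in term by value, avoiding capture."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "app":
        return ("app", subst(term[1], name, value), subst(term[2], name, value))
    param, body = term[1], term[2]
    if param == name:
        return term                       # name is shadowed inside this lambda
    if param in free_vars(value):
        # Rename the bound variable so it cannot capture a free one in value.
        renamed = fresh_name(param, free_vars(body) | free_vars(value) | {name})
        body = subst(body, param, ("var", renamed))
        param = renamed
    return ("lam", param, subst(body, name, value))


def beta(term):
    """Expand one call in-line if term is a lambda applied to an argument."""
    if term[0] == "app" and term[1][0] == "lam":
        _, param, body = term[1]
        return subst(body, param, term[2])
    return term


# (lambda x. lambda y. x) applied to the free variable y
call = ("app", ("lam", "x", ("lam", "y", ("var", "x"))), ("var", "y"))
expanded = beta(call)
```

A naive textual substitution would turn this call into a function that returns its own parameter; renaming the bound variable keeps the free y free, which is the scoping property that macro expansion does not always preserve.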

8 Section 1.2.2 outlines the basic purposes of compiling.

9 Actually, in Pascal, only scalar types can be returned by a function. However, other languages do not have this restriction.

10 Beta expansion or beta reduction replaces a function call with the function's body, having substituted actual parameters for
formal parameters in the body. Care must be taken to preserve the properties of lexical scoping. Beta expansion is similar to
macro expansion, except that macro expansion does not always guarantee that the properties of lexical scoping are preserved.
1.4.2.3. Comparing Phases and Partial Evaluation


As  developed  in  Chapter  4,  phase  evaluation  differs  from  partial  evaluation  in  two  important  ways: 
(1)  a  program’s  various  phases  are  explicitly  indicated  in  the  application  program,  and  (2)  program 
fragments  can  be  manipulated  as  first-class  values  of  an  abstract  data  type  (the  data  type  ERT).  The 
latter  difference  gives  a  macro-like  capability,  and  the  primitive  operations  that  implement  the  abstract 
data type ensure that all generated programs are type correct.

The  development  in  Chapter  5  shows  how  modifying  and  restricting  phases  might  result  in  a  system 
that  essentially  performs  partial  evaluation.  Section  5.4  proposes  a  "phase  compiler"  approach  that  is 
analogous  to  the  "compiled  generation"  approach  of  Beckman,  et  al.  [Beckman  76],  but  ours  applies  to 
strongly typed languages, whereas theirs applied to the untyped language LISP [McCarthy 66]. This
approach  allows  one  abstract  machine  to  efficiently  perform  both  "partial"  evaluation  and  "full" 
evaluation. 

If  phases  were  adapted  to  perform  partial  evaluation  as  discussed  in  Chapter  5,  the  most  important 
remaining  differences  between  phase  evaluation  and  partial  evaluation  would  be  that:  (1)  phase 
evaluation  syntactically  distinguishes  between  those  portions  of  a  program  that  are  being  "partially" 
evaluated  and  those  that  are  being  "fully"  evaluated,  thus  allowing  the  phase  evaluator  to  perform  both 
"partial"  and  "full"  evaluation  efficiently;  and  (2)  under  phase  evaluation,  a  program’s  result  type  is 
always  known  before  the  program  is  evaluated. 

1.4.3.  Current  Work  by  Gifford,  Schooler,  et  al. 

Gifford, Schooler, et al. apparently assume a similar general programming method to ours (described
in Section 2.1). Their "kernel" language, the Imagine Base Language (IBL), corresponds to our
Implementation Language (IL). Their programming method also assumes a partial evaluator,
compilers, and interpreters, whereas ours includes a single Implementation Language Machine; our
programming method makes explicit the operation of combining programs to form new programs,
whereas theirs does not.


Their  approach  to  providing  an  extensible  yet  efficient  language  is  based  on  partial  evaluation: 
specially defined forms can be converted to simpler, more efficient forms by partial evaluation. From
Schooler  [Schooler  84]: 

Our proposed methodology is a generalization of the Russell [Boehm 80] and
EL1 [Wegbreit 74] techniques: all [language] extensions are implemented in the language,
allowing full user access to the extension mechanism. In addition, partial evaluation will be
used to optimize the code to the point where using the user-defined extension mechanisms
is essentially free in terms of runtime performance.

Gifford, Schooler, et al. also use their "front end" translators to insert assertions into the kernel
language (IBL) code, and use the partial evaluator to compute as many of these assertions as possible.
Since type checking is handled by inserting assertions about types, they thus provide compiletime type
checking where possible and runtime type checking where necessary. Again from Schooler
[Schooler 84]:

The  code  which  the  partial  evaluator  acts  on  will  be  generated  by  syntactic  transforms 
from  surface  language  constructs.  The  generated  code  will  preserve  all  user-specified  side- 
effects  but  will  also  include  applicative  constructs  for  type  checking,  etc. 

Finally,  since  Gifford,  Schooler,  et  al.  are  using  partial  evaluation,  the  comments  on  partial 
evaluation  given  in  Section  1.4.2.2  apply  to  their  work  as  well. 


Chapter 2

Programming  Method 

This  chapter  discusses  an  assumed  programming  method.  This  programming  method  is  very  simple 
and  rudimentary,  and  is  not  the  focus  of  the  research.  It  is  included  only  to  provide  the  necessary 
framework  for  discussing  the  main  thesis  of  this  work:  the  notion  of  phases. 

The reader wishing to skim this chapter must be sure not to skip over Section 2.2.1, which defines
ERTs,  and  is  essential  to  subsequent  chapters. 

2.1.  General  Programming  Method 

The  general  programming  method  shown  in  Figure  2-1  illustrates  how  programs  (or  program 
fragments)  may  be  used  to  create  other  programs.  There  are  three  essential  aspects,  described  in  the 
following  sections. 

2.1.1. Distinct Application and Implementation Languages

First, the general programming method assumes that humans write source programs (or fragments)
in an application language (Phi) that is syntactically convenient for humans, and that these programs
are then translated into an implementation language (IL) that is more convenient for mechanical
interpretation. This prevents the programmer from directly writing ill-formed programs in the
common implementation language. Because all programs in the implementation language are
generated and manipulated mechanically, they can be guaranteed to have certain properties: in
particular, to be syntactically correct and to be free of possible runtime type errors. (Runtime type
errors were defined in Section 1.2.1.1.)




This work defines two versions of a simple application language, Phi, and a simple implementation
language, IL. A Phi Translator, which translates from Phi to IL, is also defined.
Figure 2-1: General Programming Method
2.1.2. Programs Are Combined


Second,  the  programming  method  assumes  that  useful  programs  in  the  common  implementation 
language, possibly from libraries, may be combined to form new programs. In this way, various
software  components  could  be  reused. 

The  operation  of  combining  IL  programs  is  not  defined  here.  It  is  assumed  to  be  handled  by 
whatever  particular  programming  method  the  programmer  uses,  and  is  not  essential  for  discussing  the 
notion  of  phases.11 

2.1.3.  Programs  Are  Instantiated 

Finally,  the  programming  method  assumes  that  a  program  can  be  specialized,  instantiated,  refined, 
or  evaluated  to  form  various  versions  or  to  compute  the  final  answer.  An  entire  tree  of  versions  might 
be derived. This aspect is consistent with notions of transformational implementation [Cheatham 81],
mechanized top-down stepwise refinement, and partial evaluation [Ershov 77a]. It also means that a
version  might  be  generated  that  would  gather  program  performance  statistics,  and  these  statistics  could 
be  used  in  automatically  instantiating  a  more  efficient  version  for  those  data  characteristics  [Balzer  83]. 

Instantiation is defined in this work by the semantics of the Implementation Language (IL), that is,
by the IL Machine.

2.2.  Specific  Programming  Method 

Before describing the notion of phases, let us first discuss the assumed programming method more
specifically as it relates to the succeeding description of phases. Figure 2-2 illustrates the specific
programming method. It involves application programs written in Phi, a Phi Translator, IL programs
in the form of ERTs (defined below), and an IL Machine.


11 Nonetheless, the special data type ERT, described in Section 2.2.1, and the example languages Static-Phi and Static-IL
described in Chapter 4 make it easy to manipulate and combine type-checked program fragments with integrity under program
control. In fact, that is precisely the purpose of the unusual (check* ) constructs of Static-IL listed in Section 4.2.3: they take
type-checked IL programs (in the form of ERTs) and combine them to produce new type-checked IL programs. A combining
program would thus take ERT values as input (from the environment) and produce an ERT value. Section 4.3 discusses the
environments required by Static-IL programs and shows examples of ERT values.




Figure  2-2:  Specific  Programming  Method 


2.2.1. ERT: Expression, Required-environment, Type


In order to interpret the specific programming method shown in Figure 2-2, we must first define a
special  data  type  for  representing  type-checked  program  fragments:  the  data  type  ERT.  An  ERT  is  a 
triplet  having  the  following  components: 

Expression
    An expression in the Implementation Language.

Required-environment
    A list of each free variable appearing in the Expression component, paired with its
    type. Each free variable is listed once, with one type, and no other variables are
    listed.

Type
    The Expression component will evaluate to a value of this type.

The  purpose  of  ERT  triplets  is  to  facilitate  manipulating  programs  (expressions),  both  in  the  overall 
programming  method  and  in  the  implementation  language,  while  ensuring  their  integrity.  We  are  not 
interested  in  just  any  conceivable  <e,r,t>  triplet  -  only  those  that  are  meaningful,  or  valid,  as  defined 
below.12 

2.2.2.  Valid  ERTs 

An ERT <e,r,t> is valid if expression e, evaluated in an environment that satisfies the
required-environment r, is guaranteed to evaluate to a value of type t. By "an environment that satisfies the
required-environment" we mean an environment env such that for each variable-type pair <v,t> listed
in required-environment r, variable v is bound to a value of type t in env.

Every ERT generated by the Phi Translator or the IL Machine is valid.13
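
As an informal illustration of the ERT triplet and of what it means for an environment to satisfy a required-environment, consider the following Python sketch. The class `ERT`, the helper `type_name`, the function `satisfies`, and the type names "number", "boolean", and "number->number" are all assumptions made for this sketch; they are not the representation used by the IL Machine. Storing the required-environment as a dictionary reflects the rule that each free variable is listed once, with one type.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict

@dataclass(frozen=True)
class ERT:
    expression: Any               # an IL expression (opaque in this sketch)
    required_env: Dict[str, str]  # free variable -> type name, one entry each
    type: str                     # type of the value the expression yields

def type_name(value):
    """Hypothetical mapping from Python values to the type names used here."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if callable(value):
        return "number->number"   # assumption: only unary numeric functions occur
    return type(value).__name__

def satisfies(env, required_env):
    """True if env binds every required variable to a value of the listed type."""
    return all(
        name in env and type_name(env[name]) == wanted
        for name, wanted in required_env.items()
    )

# A triplet for an expression that applies a cosine function to x.
cos_x = ERT(
    expression=("cos", "x"),
    required_env={"x": "number", "cos": "number->number"},
    type="number",
)
good_env = {"x": 0.5, "cos": math.cos}
bad_env = {"x": "abc", "cos": math.cos}
```

With these definitions, `satisfies(good_env, cos_x.required_env)` holds, while `bad_env` is rejected because `x` is bound to a string. Note that the sketch only checks the environment side of validity; whether the expression really yields a value of the stated type is the guarantee that the Phi Translator and the IL Machine provide.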


12 David MacQueen and John Mitchell have aptly pointed out that a valid ERT corresponds closely to the notion of a typing.
To quote Reynolds [Reynolds 85]:

"Let e be an expression, $\pi$ (often called a type assignment) be a mapping of (at least) the
identifiers occurring free in e into types, and $\omega$ be a type. Then
$\pi \vdash e : \omega$
is called a typing, and read 'e has type $\omega$ under $\pi$'."

Note that this interpretation is assuming a particular deduction or evaluation mechanism, represented by the symbol $\vdash$. (It
might be more precise to subscript this symbol with the name of the deduction mechanism, such as $\vdash_i$.) Similarly, there is a
corresponding implied deduction mechanism for ERT triplets, which is given by the semantic rules for interpreting expressions
in the Implementation Language.

13 To prove this assertion would be quite tedious. The last section of Appendix A includes a brief sketch of how to approach
proving it.
2.2.3. Interpreting the Specific Programming Method


First,  the  programmer  writes  a  Phi  program. 

Next,  the  programmer  invokes  the  Phi  Translator  to  translate  this  program  to  a  valid  ERT  (i.e.  an  IL 
program). 

The  programmer  might  next  use  some  method  of  combining  various  ERTs  to  create  a  new  ERT. 

Then,  the  programmer  creates  an  appropriate  environment,  and  evaluates  the  ERT  by  invoking  the 
IL  Machine  on  this  environment  and  the  Expression  component  of  the  ERT.  The  environment 
supplies  the  input,  and  must  include  values  of  the  proper  types  for  all  free  variables  in  the  expression. 
(That is, the environment must satisfy the Required-environment component of the ERT, as discussed
in  Section  2.2.1.  The  command  interpreter  used  to  invoke  the  IL  Machine  must  enforce  this,  as 
discussed  in  Section  2.2.4.) 

The  Type  component  of  the  ERT  tells  what  type  of  value  the  IL  Machine  will  produce,  assuming  no 
"compiletime"  error  occurs.  (If  such  an  error  does  occur,  one  can  either  think  of  the  IL  Machine  as 
returning some special error value distinct from all other legitimate values, or as returning nothing at
all, since evaluation is aborted.) If the Type component is ert, the result will be another ERT;
otherwise  it  will  be  some  final  answer  --  a  number,  for  example.14  Thus,  one  knows  beforehand 
whether  the  result  of  executing  each  IL  program  will  be  another  ERT  (another  program)  or  a  final 
answer. 

The case when the IL Machine produces another ERT is especially interesting. Since an ERT
contains an expression in the Implementation Language, the IL Machine can be viewed as specializing,
instantiating, or (possibly) partially evaluating a program, as in the General Programming Method
(Section 2.1). But it can also be viewed as compiling a program, though the source and object
languages are the same. This is further explained in Chapter 3.

Note that it is trivial to determine whether the result produced by the IL Machine will be a final
answer: the Type component of an ERT specifies the type of value that will be produced when the
Expression component is evaluated. Thus, if the Type component is anything other than the literal
ert, the result of the phase evaluating the expression will be a final answer. This is evident in the
examples of Section 4.4.

14 A final answer is defined as any value other than an ERT; that is, it is not another program. It might be a number, boolean,
string, or other such basic value. Conceptually, a final answer might be an entire file, though in our simple pedagogical
languages it will not.


2.2.4.  A  Command  Interpreter 

Certain  aspects  of  the  programming  method  shown  in  Figure  2-2  must  be  done  by  the  human.  For 
example,  the  human  must  write  the  original  Phi  program,  invoke  the  Translator  on  it,  combine 
program  fragments  (ERTs)  as  desired,  supply  the  desired  environment,  and  invoke  the  IL  machine  on 
the  Expression  component  of  the  desired  ERT.  The  simplest  method  of  doing  these  things  is  to 
provide a command interpreter -- most naturally written in the Phi language itself -- and this is what we
will  assume,  though  any  other  more  automated  method  is  possible  as  well.  It  is  the  command 
interpreter’s  responsibility  to  ensure  that  the  expression  and  environment  actually  given  to  the  IL 
Machine  are  syntactically  correct,  type  correct,  and  compatible.  However,  the  use  of  ERTs  makes  this 
very  easy  to  enforce,  especially  since  every  ERT  produced  by  the  Phi  Translator  or  the  IL  Machine  is 
guaranteed  to  be  syntactically  correct  and  type  correct,  and  the  Required-environment  explicitly  lists 
the  identifiers  and  types  of  values  required  in  the  environment. 

2.2.5.  Environments 

An environment simply provides bindings of identifiers to values. As shown in Figure 2-2, along
with each IL program (EXPR), the IL Machine must be given an environment (ENV) that supplies
values of the correct type for all free variables in the IL program (EXPR). In our simple model, an IL
program's input must be supplied via the environment: that is, the IL program might have a free
variable representing the program's input, and the environment would have to supply a value for that
free variable, thus providing the program's input.

For  example,  consider  the  following  trivial  program  that  computes  the  cosine  of  a  number,  x. 

(cos x)

The free variable x represents the program's input and the free variable cos refers to a standard cosine
trigonometric function. Thus, the environment for this program must be constructed to include
bindings for x (a number), and cos (a function from numbers to numbers). Typically, the binding for
cos would come from a standard library, whereas the binding for x would be explicitly provided by the
user.
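
The role of the environment in this example can be imitated with a few lines of Python. The evaluator below is a hypothetical stand-in, not the IL Machine: it assumes expressions are nested tuples in prefix form, that a string denotes a free variable, and that the names `evaluate`, `library`, and `user_input` are chosen only for this sketch.

```python
import math

def evaluate(expr, env):
    """Evaluate a prefix-form expression, looking free variables up in env."""
    if isinstance(expr, str):            # a variable: its value comes from env
        return env[expr]
    if isinstance(expr, (int, float)):   # a numeric literal evaluates to itself
        return expr
    fn, *args = expr                     # an application: (fn arg ...)
    return evaluate(fn, env)(*(evaluate(arg, env) for arg in args))

library = {"cos": math.cos}      # binding supplied by a standard library
user_input = {"x": 0.0}          # binding supplied explicitly by the user
env = {**library, **user_input}  # the combined environment for the program

result = evaluate(("cos", "x"), env)
```

Evaluating the same expression in an environment that lacks a binding for `x` fails with a `KeyError`, which is the kind of mismatch the command interpreter is expected to rule out before the IL Machine is invoked.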


We  do  not  show  how  environments  are  generated,  but  the  command  interpreter  can  provide  ways  of 
creating,  combining,  and  storing  environments,  while  keeping  track  of  the  types  of  the  variables 
defined in them. Pebble [Burstall 84], for example, uses bindings as first-class values, and provides
operations  for  creating  and  combining  them. 


Section  4.3  explains  more  about  the  environments  required  for  Static-IL  programs.  (Static-IL  is 
discussed in Section 4.2.)

2.3.  Motivating  Example:  General  Purpose  Sorting  Function 

This  section  describes  a  hypothetical  example  of  how  general-purpose  reusable  programs  might  be 
created  and  used.  The  purpose  of  this  example  is  to  provide  a  tangible  goal  to  guide  the  reader's 
intuition  through  the  rest  of  this  work,  where  the  notion  of  phases  is  explained.  The  reader  may  wish  to 
skip  this  section  at  first,  and  return  to  it  later  as  needed. 

Bear  in  mind  that  the  languages  discussed  in  this  work  are  provided  for  pedagogical  purposes  only. 
They  would  not  be  practical  for  real-life  applications  such  as  the  motivating  example  described  in  this 
section.  However,  these  pedagogical  languages  should  demonstrate  the  basic  semantic  notions 
necessary  in  a  full,  usable  language  that  could  be  practically  applied  to  the  example  below. 

2.3.1. The Desire for a General-Purpose Sorting Function

Consider  the  problem  of  providing  a  truly  general-purpose  sorting  function.  Such  a  function  should 
be  able  to  efficiently  handle  a  wide  range  of  sorting  needs,  from  sorting  a  small  fixed  number  of  items 
in  the  computer's  primary  memory,  to  sorting  thousands  of  records  in  primary  memory,  to  sorting 
millions  of  records  in  secondary  memory  such  as  disk  or  tape.15 


Clearly, it is impossible for a single sorting function to fill all of these needs efficiently enough to be
generally  useful,  because  there  are  many  different  algorithms  that  are  appropriate  for  different  needs. 
Any  single  program  that  tried  to  meet  all  needs  would  be  much  too  large  to  be  practical  for  the  smaller 
cases. 


15 This example comes from another author, but we have been unable to determine whom.
2.3.2. A Sorting Function Generator

But  consider  a  sorting  function  generator.  This  generator  could  be  given  input  characterizing  a 
particular  sorting  need,  and  would  produce  a  sorting  function  custom-tailored  for  that  application. 
Input to the generator might include parameters describing the data types to be sorted, where the data
are stored, the type of algorithm to be used, or even a characterization of the generated program's
expected  input  data  distribution.  The  generator  would  use  this  information  to  choose  the  most 
appropriate  algorithm  (from  some  repertoire)  and  produce  the  most  efficient  data  structure 
declarations. 

An  automatically  generated  sorting  function  probably  would  not  be  quite  as  efficient  in  every  case  as 
a  sorting  function  that  a  programmer  could  write  from  scratch.  However,  it  could  be  good  enough  in 
most  cases  that  it  would  be  far  more  cost-effective  to  use  the  automatically  generated  version  than  to 
write  a  new  one.  This  is  a  fundamental  assumption  behind  the  desire  for  reusable  software. 

2.3.3.  Explicit  Generation  vs.  Partial  Evaluation 

The  sorting  function  generator  could  be  written  in  two  ways:  it  could  explicitly  manipulate  program 
fragments  for  the  generated  program,  or  it  could  be  written  as  one  big  parameterized  sorting  program 
that  is  partially  evaluated  to  produce  a  small  specialized  version.  Ignoring  the  lack  of  strong  typing  in 
LISP,  the  approach  of  explicit  manipulation  might  correspond  to  LISP  programs  that  construct  other 
LISP  programs  as  S-expressions.  In  the  partial  evaluation  approach,  the  language  would  have  to  allow 
types  as  first-class  values  so  that  data  type  declarations  could  be  parameterized  by  input  values,  and  the 
partial evaluator would manipulate program fragments to produce a specialized version -- the program
would not express this manipulation explicitly. Regardless of which approach is taken, the important
point here is that the work of producing the specialized or generated sorting program must be separated
from the sorting program's execution. For this discussion, we will assume that explicit generation is
used. 

2.3.4.  Phases  Used 

Let us now clearly distinguish between the act of executing the sorting program generator and the act
of executing the generated sorting program. These executions correspond to two acts of instantiation,
shown in Figures 2-1 and 2-2, or two phases, n and n+1, as described in Chapter 3.

To ensure strong typing, both the sorting program generator and the generated program must be
guaranteed against runtime type errors. In the phase parlance of Chapter 3, if the generator is to be
executed in phase n, it can be type checked in phase n-1; if the generated program is to be executed in
phase n+1, it can be type checked in phase n.
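
The separation between executing the generator and executing the generated program can be sketched in Python, with closures standing in for generated programs. Everything here is an assumption made for illustration: the name `make_sorter`, the `expected_size` parameter, the arbitrary threshold of 8, and the choice between insertion sort and merge sort. Python performs no type checking between the two steps, so the sketch shows only the staging, not the strong typing guarantee.

```python
def make_sorter(expected_size, key=lambda item: item):
    """Generator step: choose an algorithm and return a specialized sorter."""
    if expected_size <= 8:               # arbitrary threshold for this sketch
        def insertion_sort(items):
            result = []
            for item in items:
                pos = len(result)
                # Walk left past every element that must come after item.
                while pos > 0 and key(result[pos - 1]) > key(item):
                    pos -= 1
                result.insert(pos, item)
            return result
        return insertion_sort

    def merge_sort(items):
        if len(items) <= 1:
            return list(items)
        mid = len(items) // 2
        left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if key(right[j]) < key(left[i]):
                merged.append(right[j])
                j += 1
            else:
                merged.append(left[i])
                i += 1
        return merged + left[i:] + right[j:]
    return merge_sort

# First execution: run the generator to obtain specialized sorting functions.
small_sorter = make_sorter(expected_size=4)
large_sorter = make_sorter(expected_size=10000)

# Second execution: run the generated functions on actual data.
small_sorted = small_sorter([3, 1, 2])
large_sorted = large_sorter([5, 4, 3, 2, 1, 0, 9, 8, 7, 6])
```

The call to `make_sorter` plays the role of the generator's execution, and the later calls to `small_sorter` and `large_sorter` play the role of the generated program's execution.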

2.3.5.  Generalizing  Further 

So  far  we  have  focused  on  the  application  program's  need  to  use  a  general-purpose  function.  To 
generalize  the  example  further,  suppose  that  the  sorting  program  generator  also  uses  some  general- 
purpose  mathematical  function  that  also  must  be  specialized  before  being  used.  Thus,  a  math  function 
generator  would  produce  a  specialized  version  of  the  math  function,  which  would  be  used  in  the 
sorting  function  generator  to  produce  a  specialized  sorting  function,  which  would  be  used  in  some 
application program. Again, in the parlance of Chapter 3, the math function generator would be
executed  in  phase  n-1  to  produce  and  type  check  the  specialized  math  function,  which  would  be  used 
by  the  sorting  function  generator  in  phase  n.  These  phases  are  illustrated  in  Figure  2-3. 

In  summary,  general-purpose  function  generators  can  be  used  to  produce  specialized  functions, 
which may themselves be used by other general-purpose function generators.


Figure  2-3:  Phases  of  Sorting  Example 
Chapter 3

The  Conceptual  Model  of  Multiple  Phases 


This  chapter  describes  the  conceptual  model  of  multiple  strongly  typed  evaluation  phases.  Proper 
understanding  of  the  conceptual  model  is  critical  in  understanding  the  Phi  language,  the  Phi  translator, 
and  the  Implementation  Language. 

3.1.  Arriving  at  Phases  by  Extending  Compiletime 

The conceptual model of phases is best understood by presenting the arguments that led to its
development. We begin with a simple conceptual view of traditional compiletime and runtime, shown
in Figure 3-1.

First, a source program, written in a strongly typed language, is compiled into an object program.
This step is Phase 1 -- compiletime. During this phase, the compiler manipulates type values and
program code, and as a result produces an intermediate object program that is guaranteed free of
runtime type errors.16

The object program, with a suitable environment, is then executed on an implementation machine.
This step is Phase 2 -- runtime. During this phase, basic values such as numbers, character strings, and
booleans are manipulated, and the result of the computation is some basic final value such as a
number, a character string, a boolean, or, conceptually, a file. The environment defines all identifiers
that are not locally declared in the program; that is, it provides bindings for all of the program's free
variables.


16 The question of whether there could be type errors in a program's input sometimes arises here. For example, an input
operation requiring a number could instead be given some meaningless character string. This problem can be avoided by only
providing an input operation that always reads characters, and forcing type conversion to be accomplished by ordinary functions
under programmer control. Thus, for example, the input sequence "123" would be read as the characters "1" "2" "3" of known
type and then converted by the program to the numeric value 123.

For simplicity, the simple pedagogical languages described in this work do not include input or output operations.
3.1.1. Generalizing Compiletime


Let  us  now  view  the  compiler  as  executing  the  source  program  to  produce  the  object  program,  and 
allow  the  programmer  to  express  types  as  first-class  values  that  are  manipulated  during  compiletime. 
And  to  provide  really  useful  expressive  power,  let  us  also  allow  the  programmer  to  express  other  types 
of values, for example, numbers and booleans, at compiletime, and to write arbitrary compiletime
expressions  and  functions  involving  these  values. 

Now, with values of various kinds (numbers, booleans, and of course types) being manipulated at
compiletime, it is conceivable that a so-called "runtime" type error could occur during compiletime.
For example, one may mistakenly try to add a number to a type during compiletime. Therefore, we
add another phase -- a pre-compiletime phase -- that does the type checking required to ensure that no
"runtime" type errors can occur during compiletime.

Our conceptual model, at this point, is shown in Figure 3-2. Phase 1, the pre-compiletime phase,
now manipulates type values and program code, and as a result produces a program that is guaranteed
not to commit a "runtime" type error when executed during the next phase. Phase 2, compiletime,
now manipulates numbers, booleans, type values, and program code, and produces a program that is
guaranteed free of runtime type errors. Phase 3, runtime, manipulates numbers and booleans as
before, producing a final answer (number, boolean, etc.).
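
The need for a checking phase that precedes compiletime can be illustrated with a toy expression language in Python. This is a sketch under assumptions, not the mechanism defined in this work: the tuple tags "num", "type", and "add" and the functions `check` and `run` are hypothetical. The function `check` plays the role of the earlier phase, and `run` plays the role of the phase it protects.

```python
def check(expr):
    """Earlier phase: determine what kind of value expr yields, or reject it."""
    tag = expr[0]
    if tag == "num":
        return "number"
    if tag == "type":
        return "type"
    if tag == "add":
        left, right = check(expr[1]), check(expr[2])
        if left != "number" or right != "number":
            raise TypeError(f"cannot add {left} to {right}")
        return "number"
    raise ValueError(f"unknown expression tag: {tag!r}")

def run(expr):
    """Later phase: evaluate an expression that has already passed check()."""
    tag = expr[0]
    if tag in ("num", "type"):
        return expr[1]
    return run(expr[1]) + run(expr[2])

well_typed = ("add", ("num", 1), ("num", 2))
ill_typed = ("add", ("num", 1), ("type", "integer"))

kind = check(well_typed)   # accepted, so it is safe to execute
value = run(well_typed)
```

Calling `check(ill_typed)` raises a `TypeError` before `run` is ever invoked, which mirrors the mistaken attempt to add a number to a type being caught one phase before it could occur.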

3.1.2.  Generalizing  Pre-compiletime,  And  So  On . . . 

At this point, we can make two observations. First, the pre-compiletime phase is now performing a
role completely analogous to the role compiletime had played. Hence, we can apply the same
reasoning to generalize pre-compiletime, and add a pre-pre-compiletime phase, and so on, thus
potentially allowing an unbounded number of phases. Each phase except the last produces a
type-checked program for the next phase.

Second, we observe that the operations performed by the compiler have now become a superset of
the operations performed at runtime. Hence we can unify the two so that one Implementation
Language (IL) Machine fills both roles. The resulting conceptual model is described in Section 3.2.


Figure  3-2:  Pre-Compiletime  Phase  Added 
3.2. Conceptual Model of Multiple Phases


Figure 3-3 illustrates how a Phi program is translated and then, in effect, executed through several
intermediate phases before producing a final result.

Note  the  correspondence  between  Figure  3-3  and  the  specific  programming  method  illustrated  in 
Figure  2-2.  The  loop  shown  in  Figure  2-2  is  unfolded  in  Figure  3-3:  thus  the  conceptual  model  shows 
several repetitions of the IL Machine -- one for each time it is invoked. Also, for simplicity, the
combine  action  in  Figure  2-2  is  not  shown  in  Figure  3-3. 

3.3.  Interpreting  the  Conceptual  Model 

A Phi program is first translated to ERT1. The Expression component of ERT1 is then executed on
the IL Machine in a suitable environment ENV1 -- this is phase 1 -- to produce ERT2. The Type
component of ERT1 specifies the type of value that will be produced by phase 1. For the first phase, it
is always ert, indicating that another ERT will be produced. Similarly, the Expression component of
ERT2 is then executed in phase 2 to produce ERT3, and so on. The Type component of ERT2
specifies the type of value that will be produced as a result of phase 2, etc. The result of some phase n
is considered the final result of the computation because it is not an ERT. That is, the Type component
of ERTn indicated that the result would be something other than another ERT. Thus, the original Phi
program could be viewed as a meta-program because, in effect, it denotes a series of programs ERT1 ...
ERTn.
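
The chain of phases just described can be sketched as a simple loop in Python. This is a hedged illustration only: the class `ERT`, the function `run_phases`, the use of a Python callable in place of an IL expression, and the list of per-phase environments are all assumptions of the sketch, and no type checking is performed. The loop uses the Type component, known before each phase runs, to decide whether the result is another program or the final answer.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass(frozen=True)
class ERT:
    expression: Callable[[Dict[str, Any]], Any]  # stand-in for an IL expression
    required_env: Dict[str, str]                 # free variable -> type name
    type: str                                    # "ert" means another ERT results

def run_phases(ert: ERT, environments: List[Dict[str, Any]]):
    """Run phase 1, 2, ... until a phase yields a final answer.

    Returns (final_answer, number_of_phases_executed).
    """
    phase = 0
    while True:
        env = environments[phase]
        missing = [name for name in ert.required_env if name not in env]
        if missing:
            raise KeyError(f"phase {phase + 1}: environment lacks {missing}")
        result = ert.expression(env)   # the machine executes this phase
        phase += 1
        if ert.type != "ert":          # Type component announced a final answer
            return result, phase
        ert = result                   # otherwise the result is the next program

def phase_one(env):
    """Phase 1 body: build the program that will run in phase 2."""
    n = env["n"]
    return ERT(lambda env2: env2["x"] + n, {"x": "number"}, "number")

ert1 = ERT(phase_one, {"n": "number"}, "ert")
answer, phases = run_phases(ert1, [{"n": 1}, {"x": 41}])
```

In this run, phase 1 consumes `n` and produces a second ERT whose Type component is "number", so phase 2 yields the final answer and the loop stops after two phases.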

3.3.1.  Properties  of  the  Conceptual  Model 

The  conceptual  model  has  the  following  important  properties: 

-  Each phase does the type checking necessary to ensure that no runtime type errors are
possible during the next phase.

-  No runtime type errors are possible during the first phase, either.

-  Every ERT produced by the Phi Translator or the IL Machine is valid.17 Specifically, the
Expression component is guaranteed syntactically correct and type correct.

-  The type of each subexpression is computed at least one phase before the value of that
subexpression is computed. Similarly, the type of the program's result is known before the
program is phase evaluated (i.e., it is given as the Type component of an ERT).

17 As defined in Section 2.2.2.

Figure 3-3: Conceptual Model of Multiple Phases

-  Each phase acts as compiletime for the next phase, and as runtime for the previous phase.

Thus, the terms "compiletime" and "runtime" are relative. These terms will still be used in
the rest of this work -- they are still meaningful terms -- but the reader should recognize
that their meanings are relative to other implied runtime or compiletime phases.

- The Phi Translator does no type checking -- it will produce a valid ERT, free of possible
runtime type errors, for any syntactically legal Phi program. This is explained in Section
3.3.3.

3.3.2. Resolving the Conflict Between "Strong Typing" and "Types as First-Class
Values"

Section  1.2.3.3  points  out  the  inherent  conflict  between  strong  typing  and  the  desire  for  types  as 
first-class  values.  In  our  model  of  multiple  phases,  types  are  indeed  allowed  as  first-class  values,  yet 
every  phase  is  strongly  typed.  How  is  the  conflict  avoided  in  our  model? 

In general, types manipulated as first-class values during one phase become invariants of the next
phase, in the sense that a type used in a declaration represents an invariant. If an identifier is declared
to be some type, that type represents an invariant on the kinds of value that may be bound to that
identifier. Similarly, if a function's return value is declared to be a certain type, that type represents an
invariant on the kinds of value that the function may return.

In  our  model  it  is  not  possible  to  use  a  type,  computed  as  a  first-class  value,  as  an  invariant  of  the 
same  phase  during  which  it  was  computed.  Type  values  computed  in  one  phase  have  no  bearing  on  the 
types  of  the  expressions  executed  during  that  same  phase.18  One  can  compute  an  arbitrary  type  value 
during  one  phase,  but  that  type  value  can  only  be  used  in  declarations  pertaining  to  subsequent  phases 
-- not in the declarations pertaining to that same phase.19 For example, in the same phase, one cannot
both compute the type used to declare an identifier and bind a value of that type to the identifier. The
type  of  the  identifier  must  be  computed  during  at  least  one  phase  before  the  identifier  may  be  bound  to 
a  value  of  that  type.  Thus,  the  notion  of  separate  phases  prevents  any  possible  circular  dependency 
between  an  object's  type  and  its  value. 

18

This property is readily evident in Static-IL, presented in Section 4.2. Expressions in Static-IL are type checked and
generated in the form of ERTs, and in doing so, types are computed as first-class values. However, there is no construct in
Static-IL for evaluating an ERT. That is, there is no provision for invoking the Static-IL Machine from within Static-IL. Hence,
there is no way for the type values computed in one phase to have any effect on the types of the identifiers or expressions
evaluated during that same phase.

19

Conceivably, the type may even be computed by a recursive function, as mentioned in Section 6.5, though for simplicity
recursive functions are not provided in the Static-Phi and Static-IL languages described in Chapter 4.


3.3.3.  The  Paradox  of  Strong  Typing  Without  Prior  Type  Checking 


We mentioned that every phase is strongly typed, and that the IL program for every phase -- except
the first -- is type checked by the previous phase. We require that the first phase also be strongly typed,
yet we also mentioned that the Phi Translator does no type checking. How can we ensure that the IL
program produced by the Phi Translator does not contain any runtime type errors if the Phi Translator
does no type checking? The answer is simple: the translator produces an IL program in which every
subexpression evaluates to a value of the same type: type ERT.

This  means  that  the  only  operations  performed  during  the  first  phase  are  manipulations  of  program 
fragments. This makes sense when one considers what would happen if it weren't true. That is, suppose some other
operation  --  addition  of  two  numbers,  say  --  could  be  performed  during  the  first  phase.  Then,  to 
guarantee  that  this  operation  could  not  involve  a  runtime  type  error,  there  would  either  have  to  be 
another  previous  phase  or  the  translator  would  have  to  do  some  type  checking. 

Hence,  every  variable  is  type  ERT  initially,  and  every  IL  program  produced  by  the  translator 
evaluates to an ERT (assuming no compiletime errors occur during evaluation). (If a compiletime
error  does  occur  during  evaluation,  the  program  can  either  be  thought  of  as  returning  some  special 
error  value,  distinct  from  all  other  values,  or  as  returning  nothing,  since  the  evaluation  is  aborted.) 

3.3.4.  All  Expressions  Start  Out  Type  ERT 

If we view the IL programs ERT1...ERTn in Figure 3-3 as representing successive versions of the
initial Phi program, then the type of every subexpression or variable in the initial Phi program starts
out as ERT, and remains ERT until some phase when it becomes fixed as some basic type, such as a
number or a boolean (any type other than ERT). Finally, during the following phase, the expression or
variable will have a value of that type (number or boolean).

This  one-way  progression  represents  the  accumulation  of  information  about  the  expression  or 
variable.  Type  ERT  means  that  nothing  is  known  about  the  expression  or  variable.  Then,  during  some 
phase,  the  type  of  the  expression  or  variable  is  known  (number  or  boolean,  for  example).  Finally, 
during the next phase, the specific value of the expression or variable is computed. This subject is
mentioned further in Section 7.2.8.

3.4. Assigning Computations to Phases

Given  a  source  program,  we  need  some  way  to  decide  during  what  phases  its  various 
subcomputations  should  be  performed.  For  example,  we  require  that  the  type  of  a  function's  formal 
parameter  be  computed  at  least  one  phase  before  the  function  can  be  applied  to  any  actual  arguments. 
There  are  two  basic  approaches  we  can  take;  the  first  of  these  has  two  variations. 

1.  Static  determination.  The  phase  for  each  computation  is  fixed  during  translation,  before 
the  first  phase.  This  approach  most  closely  follows  the  reasoning  presented  in  Section  3.1, 
which  led  to  the  idea  of  multiple  strongly  typed  evaluation  phases,  and  this  is  the  approach 
on  which  this  work  was  initially  based.  There  are  two  sub-options  possible  under  this 
approach: 

a.  The  source  program  can  explicitly  indicate  which  computations  are  to  be  performed 
during  each  phase.  This  was  the  original  approach  conceived  as  "multiple  strongly 
typed  evaluation  phases",  and  is  described  in  Section  4. 

b. The Phi Translator might infer which computations should be performed during each
phase.  This  approach  was  not  pursued  in  this  work.  We  do  not  know  how  difficult 
this  alternative  might  be,  or  what  problems  it  might  present.  It  is  open  for  future 
research,  as  mentioned  in  Section  7.2.9. 

2. Dynamic determination. The phase for each computation is determined during the various
execution  phases,  and  depends  on  the  environments  supplied  during  the  previous  phases. 

This  would  allow  phases  to  achieve  the  effect  of  partial  evaluation,  because  the  types  and 
values  of  different  free  variables  could  be  "fixed"  as  desired  during  different  phases.  The 
essential  distinctions  between  this  kind  of  phase  evaluation  and  partial  evaluation  are  that, 
under  phases,  the  same  machine  would  be  used  to  perform  "partial"  and  "full"  evaluation, 
there  is  a  rigid  requirement  of  strong  typing  in  each  phase,  and  the  type  of  result  --  either 
the final answer or another program -- would be known in advance.

This  approach  has  not  been  fully  explored,  but  the  possibility  is  discussed  in  Chapter  5. 

3.5.  How  Many  Phases  Are  Required? 

How many phases will be required to execute a given Phi program to a final answer? In general, the
answer depends on the program, whether a model of static or dynamic determination of phases is used,
and might depend on the environments provided in the various phases.

Any given program will always require some minimum number of phases before it can produce a
final answer. For the Static-Phi language described in Section 4.1, an algorithm (Count, defined in
Appendix A) is used to compute this minimum based on the lexical nesting level of emits and evals in
the original Static-Phi program, and, hence, it cannot be infinite for a finite-sized program.20 For
example, the program demonstrated in Section 4.4.3 requires three phases, whereas the program in
Section 4.4.9 requires four phases.

What  about  using  more  phases  than  the  minimum?  When  phases  are  determined  statically,  there  is 
little  flexibility  for  extra  phases,  because  a  given  program  would  expect  certain  inputs,  via  the 
environments,  in  certain  phases.  (Section  4.3  discusses  environments  for  Static-Phi.) 

If  phases  were  determined  dynamically,  with  each  phase  performing  the  function  of  partial 
evaluation,  then  extra  phases  might  freely  be  used.  Partial  evaluation  is  defined  to  preserve  the 
semantics  of  the  original  program,  so  extra  phases  should  certainly  cause  no  harm,  and  they  may 
improve  the  efficiency  of  later  phases  by  allowing  the  values  of  some  expressions  to  be  pre-computed. 
Of  course,  if  there  are  no  more  expressions  that  can  be  pre-computed,  adding  an  extra  phase  does 
nothing  useful.  As  a  trivial  example,  consider  the  program  consisting  only  of  the  variable  x.  If  no  final 
value  is  given  for  x,  x  will  just  partially  evaluate  to  itself.  That  is,  the  program  will  be  partially 
evaluated  perfectly  well,  but  no  useful  work  will  be  done  because  no  further  reduction  is  possible  until 
a  final  value  is  supplied  for  x. 
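
The observation that extra phases are harmless under partial evaluation can be illustrated with a toy partial evaluator. This is a sketch under assumptions, not the mechanism explored in Chapter 5: the tuple encoding of expressions and the `partial_eval` function are hypothetical, only `+` and `*` on numbers are handled, and no type checking is modeled.

```python
def partial_eval(expr, env):
    """Reduce expr as far as env allows; return a number or a residual expression."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):                # a variable
        return env.get(expr, expr)           # an unknown variable evaluates to itself
    op, left, right = expr                   # ("+", a, b) or ("*", a, b)
    l = partial_eval(left, env)
    r = partial_eval(right, env)
    if isinstance(l, (int, float)) and isinstance(r, (int, float)):
        return l + r if op == "+" else l * r
    return (op, l, r)                        # residual program for a later phase


program = ("+", "x", ("*", 2, "y"))
step1 = partial_eval(program, {"y": 3})      # y is fixed first: 2 * y is pre-computed
step2 = partial_eval(step1, {})              # an extra phase: nothing new is known
step3 = partial_eval(step2, {"x": 4})        # x is finally supplied
print(step1, step2, step3)
```

Here `step2` equals `step1`, matching the remark that an extra phase does no harm but also no useful work, and `partial_eval("x", {})` simply returns `"x"`, the trivial example from the text.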

Chapter 4

Static  Determination  of  Phases 


This  chapter  informally  describes  the  originally  conceived  system  of  multiple  phases,  in  which  the 
phase for a particular subcomputation is explicitly denoted in the source program. That is, the phase
for  a  given  subcomputation  is  determined  statically  during  translation.  We  demonstrate  this  approach 
by  defining  a  source  language,  Static-Phi,  a  Static-Phi  Translator,  and  an  implementation  language, 
Static-IL.  More  precise  semantic  definitions  are  given  in  Appendix  A. 

The reader well-versed in the typed lambda calculus may wish to skim Section 4.1, which describes
Static-Phi,  noting  the  special  emit  and  eval  constructs,  and  then  turn  directly  to  Section  4.2,  which 
describes  Static-IL.  Section  4.2.3  is  important  because  it  discusses  the  unusual  language  constructs  in 
Static-IL.  Finally,  the  reader  is  strongly  urged  to  read  the  discussion  of  the  two  examples  in  Sections 
4.4.3  and  4.4.9  to  gain  an  appreciation  of  how  phases  work. 

4.1.  The  Static-Phi  Language 

The  Static-Phi  language  is  expression  oriented,  and  looks  like  a  simple  typed  lambda 
calculus  [Barendregt  84]  with  two  extra  constructs  added.  Types  are  unrestricted  first-class  values; 
wherever a type is required, any arbitrary expression that evaluates to a type may be given. There is no
modifiable store, or assignment operation. There is one abstraction operator, λ, for data abstraction,
function  abstraction,  type  abstraction,  and  code  (ERT)  abstraction. 

4.1.1.  Conventional  Static-Phi  Language  Constructs 

The Static-Phi language includes the following basic forms:

constant  A literal constant, for example, a number 1234, a truth value false, or a type constant
number, bool, ert, or type. Type constant type refers to the type of types; ert is the
type of ERTs, described in Section 2.2.1.

id  An identifier (variable). An identifier always evaluates to the value bound to it in
the environment.

λ id : expr_d → expr_r . expr_body

For creating an unnamed function abstraction. Expr_d and expr_r are arbitrary
expressions that must evaluate to types; they declare the types of the domain and
range of the function, that is, expr_d is the type of the formal parameter id, and expr_r
is the type of the function's return value. Expr_body is the body of the function.
Because we require "compiletime" type checking (that is, one phase before
"runtime"), the formal parameter type will be evaluated one phase before the
function value (closure) is created. That is, if the function is to be applied in phase i,
the type of the formal parameter will be computed in phase i-1.

(expr_f expr_x)  Function application. Expr_f is an arbitrary expression that must evaluate to a
function; expr_x will evaluate to the actual argument. The type of the actual
argument must match the declared type of the formal parameter for the function;
this is checked during the phase before the function is applied. The function
application always occurs during the same phase that the actual function value is
created, regardless of any nesting inside emits or evals (described below).

(funtype expr_d expr_r)

Standard function for constructing the types of functions. The subexpressions are
evaluated (they evaluate to types) and paired to represent the types of the domain
and range of a function. Expr_d is the domain type; expr_r is the range type. Of
course, both subexpressions must be type type; this is checked one phase before the
function type is to be constructed and returned.


4.1.2.  Normal  Runtime  Phase 

Normal  runtime  phase  refers  to  the  phase  in  which  a  particular  operation  is  actually  performed  (as 
opposed, say, to the phase in which the operation is type checked). Within a single program the
normal runtime phase will be different for different instances of different operations. For example, the
type expression for a function's formal parameter might use an operation that is also used in the body
of the function. Used in the formal parameter type expression, the operation's normal runtime phase
will be one phase sooner than for the instance of the operation that appears in the body of the function.
"Normal runtime phase" is usually used as a comparative term, to contrast the different phases when
two operations are performed.


By  altering  the  normal  runtime  phase  of  an  operation,  one  can  cause  the  operation  to  be  performed 
during some phase earlier or later than it would otherwise be performed. Basically, if an operation is
used to compute a type that will be used to type check a subsequent phase, then one would want the
normal runtime phase of the operation to be one phase earlier than it otherwise would be. Or, if an
operation is used to explicitly generate some code (an ERT) that is to be executed in a later phase (as
with macro expansion), then one would also want the normal runtime phase of the operation to be one
phase earlier than it otherwise would be. On the other hand, if the operation in question were a part of
the generated code, one would want its normal runtime phase to be one phase later: that is, the normal
runtime phase of the operations that are doing the generation should be one phase earlier than the
normal runtime phase of the operations in the generated code.

The  normal  runtime  phase  of  a  construct  is  altered  in  three  ways:  by  being  inside  an  emit  (discussed 
below),  by  being  inside  an  eval  (also  below),  or  by  being  in  a  function  abstraction's  range  or  domain 
type expression. The normal runtime phase for a function abstraction's range or domain type
expression is implicitly one phase earlier than the normal runtime phase for the function, since the
function must be type checked during the phase before it is applied. Emit and eval are used to
explicitly change the normal runtime phase of an expression: eval makes the normal runtime phase
one phase earlier, while emit makes it one phase later. These are discussed below in Section 4.1.3, and
are  more  precisely  defined  in  the  formal  semantics  given  in  Appendix  A. 

4.1.3.  Some  Unusual  Constructs 

In addition to the familiar constructs outlined in Section 4.1.1, Static-Phi also includes the following
unusual  forms. 

(eval expr)  The normal runtime phase of expr is one phase earlier than in the surrounding
context. Note that the domain and range type expressions in the λ construct, expr_d
and expr_r, are effectively inside an implicit eval, because the types need to be
computed one phase before the function value (closure) is created.

(emit expr)  The normal runtime phase of expr is one phase later than in the surrounding
context.

Note  that  our  eval  is  very  different  from  the  LISP  EVAL.  Our  emit  and  eval  forms  are  only  used 
during translation. They are not executable notions, and there are no Static-IL syntactic forms that
correspond  to  them. 

Note also that emit and eval cancel each other out, in a manner analogous to the LISP back-quote
("`") and comma (",") macro constructs. Thus, (emit (eval expr)), (eval (emit expr)), and expr are
entirely equivalent in Static-Phi.
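
The way emit, eval, and the implicit eval around a function's domain and range type expressions shift the normal runtime phase can be sketched as a walk over the program text. The following Python is a simplified illustration, not the translator defined in Appendix A: the tuple encoding (`"emit"`, `"eval"`, `"lam"`, `"app"`) and the `phase_offsets` function are hypothetical, and the rule that a function application always occurs in the phase where the function value is created is deliberately ignored.

```python
def phase_offsets(expr, offset=0, out=None):
    """Collect (atom, relative phase) pairs in left-to-right order.

    An offset of 0 is the phase of the surrounding context; emit adds one,
    eval subtracts one, and a lambda's domain and range type expressions
    sit inside an implicit eval.
    """
    if out is None:
        out = []
    if isinstance(expr, str):                 # identifier or constant
        out.append((expr, offset))
        return out
    tag = expr[0]
    if tag == "emit":
        phase_offsets(expr[1], offset + 1, out)
    elif tag == "eval":
        phase_offsets(expr[1], offset - 1, out)
    elif tag == "lam":                        # ("lam", id, domain, range, body)
        _, _ident, domain, range_, body = expr
        phase_offsets(domain, offset - 1, out)
        phase_offsets(range_, offset - 1, out)
        phase_offsets(body, offset, out)
    elif tag == "app":                        # ("app", function, argument)
        phase_offsets(expr[1], offset, out)
        phase_offsets(expr[2], offset, out)
    else:
        raise ValueError(f"unknown form: {tag!r}")
    return out


# λ t : type → ert . (emit λ x : t → t . x)
generator = ("lam", "t", "type", "ert",
             ("emit", ("lam", "x", "t", "t", "x")))
print(phase_offsets(generator))
print(phase_offsets(("emit", ("eval", "e"))) == phase_offsets("e"))
```

In the first result the uses of t as a type land at offset 0 while the body x lands at offset 1, so the type is computed one phase before the value it constrains. The second line confirms that an emit wrapped around an eval leaves the offset unchanged.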

4.1.4. Examples


This  section  shows  some  simple  examples  of  Static-Phi  programs.  Section  4.4  shows  how  each  of 
these examples would be translated to Static-IL programs and appear in various phases. The
explanations  of  the  identity  function  examples  in  Sections  4.4.3  and  4.4.9  give  the  flavor  of  what  the 
various  phases  do.  We  begin  here  with  trivial  examples  and  work  up  to  more  interesting  cases. 


4.1.4.1. F Twice


(f (f x))

Some function f is applied twice to an argument x.


4.1.4.2. Identity Abstraction


λ x : number → number . x

An  unnamed  identity  function  that  takes  a  number  and  returns  that  same  number. 


4.1.4.3. Identity Application


(λ x : number → number . x 5)

The identity function from the previous example is applied to the number 5. The final result will be 5.


4.1.4.4. Function Abstraction


λ x : number → number . (succ (succ x))

If  succ  is  the  successor  function  on  numbers,  defined  in  the  environment,  this  is  an  unnamed 
function  that  adds  2  to  its  argument. 


4.1.4.5. Function Application


(λ x : number → number . (succ (succ x)) 5)

The  function  from  the  previous  example  is  applied  to  5.  The  final  result  will  be  7. 


4.1.4.6. Higher Order Function Abstraction


λ f : (funtype number number) → number . (f (f x))

This function takes another function as an argument and applies it twice to some free variable x. The
actual parameter must be a function from numbers to numbers.

4.1.4.7. Higher Order Function Application

(λ f : (funtype number number) → number . (f (f x)) g)

The higher order function from the previous example is applied to g, which must be a function from
numbers  to  numbers.  Thus,  function  g  is  applied  twice  to  the  free  variable  x.  The  program  is 
equivalent  to: 

(g  (g  x)) 

4.1.4.8.  Identity-Function  Type  Abstraction 

λ t : type → ert . (emit λ x : t → t . x)

This function takes a type t and returns code (an ERT) that will become an identity function in the
next phase. The generated identity function will be specialized for type t, and may only be applied to
values of type t.

There  are  two  function  abstractions  in  this  example:  the  outer  function  abstracts  the  type  variable  t  in 
one  phase,  and  the  inner  function  abstracts  the  variable  x  in  the  next  phase.  Note  the  emit  surrounding 
the inner function abstraction. The emit informs the Static-Phi Translator that the inner function
abstraction is to be created one phase later than the outer function abstraction. This is required
because the outer function abstraction is manipulating a type value that will be used in type checking
the inner function. Hence, during the phase when the outer function is created and applied, the inner
function is just treated as code (an ERT), and is type checked. The outer function is acting like a macro
in returning the code (an ERT) instead of returning a function value (or closure).

The outer function cannot both compute the type t as a first-class value and return the inner function
as a function value (closure) during the same phase, because, to enforce strong typing, the inner
function must be type checked during the phase before it is used as a function value. Therefore, if the
emit were omitted, a compiletime error would occur when the inner function was being type checked.
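
The two-phase life of the generated identity function can be mimicked in Python. This is a loose sketch under assumptions, not the Static-IL translation shown in Section 4.4: the dictionary standing in for an ERT, the string type names, and the functions `make_identity_code`, `check_application`, and `instantiate` are all hypothetical.

```python
def make_identity_code(t):
    """Phase i: the type t is an ordinary value; the result is code, not a closure."""
    return {"expr": ("lambda", "x", "x"), "type": ("fun", t, t)}


def check_application(code, arg_type):
    """Still phase i: reject a mismatched argument before the code is ever run."""
    tag, domain, range_ = code["type"]
    if tag != "fun" or domain != arg_type:
        raise TypeError(f"expected argument of type {domain}, got {arg_type}")
    return range_


def instantiate(code):
    """Phase i+1: the checked code becomes a function value (closure).

    Only the identity body built above is supported in this sketch.
    """
    _, param, body = code["expr"]
    if body != param:
        raise ValueError("this sketch only instantiates identity functions")
    return lambda value: value


code = make_identity_code("number")              # t is fixed as "number" during phase i
result_type = check_application(code, "number")  # checked one phase before use
identity = instantiate(code)                     # phase i+1
print(result_type, identity(5))
```

The type passed to `make_identity_code` is manipulated freely as a value, but once the code record exists its type field is fixed, and `check_application` rejects a `"bool"` argument before any function value is created.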

4.1.4.9. Identity-Function Type Application

(λ t : type → ert . (emit λ x : t → t . x) number)

The identity-function generator of the previous example is applied to type number to generate an
identity function from numbers to numbers.


4.1.4.10.  General  Type  Abstraction 


λ t : type → ert . (emit λ f : (funtype t t) → t . (f (f x)))

This function takes a type t and returns code (an ERT) that will become a function in the next phase.
The generated function will take any function from t to t and apply it twice to the free variable x.

This example demonstrates how types may be manipulated as first-class values during one phase, yet
become invariants of the next phase. The outer λ creates a function that takes (and could manipulate)
a  type  as  a  first-class  value.  However,  it  returns  code  (an  ERT)  that  has  been  type  checked  using  this 
type.  This  returned  code  happens  to  be  the  code  for  a  function  abstraction.  (Incidentally,  free  variable 
x  is  also  type  checked  when  the  ERT  for  the  function  is  generated  and  type  checked.)  Section  4.4.10 
shows  how  this  example  would  appear  in  various  phases. 

4.1.4.11.  General  Type  Application 

(λ t : type → ert . (emit λ f : (funtype t t) → t . (f (f x))) number)

The ERT-returning function of the previous example is applied to type number.

4.1.4.12.  Macro  Abstraction 

λ m : ert → ert . (m (m x))

This  function  takes  some  code  m  (an  ERT)  and  returns  code  that  applies  m  twice  to  some  free 
variable  x. 

Note that the formal parameter m and the function's return type are both ert, indicating that this
function  will  take  code  (an  ERT)  as  its  argument  and  return  code  (an  ERT)  as  its  result.  This  function 
manipulates  code,  much  like  a  macro. 

4.1.4.13.  Macro  Application 

(λ m : ert → ert . (m (m x))
(emit λ v : number → number . (succ (succ v)))
)

The macro of the previous example is applied to code which will become a function to add 2 to its
argument. Note that the actual argument is surrounded by an emit so that the (macro) function of the
previous example will operate on it as code (an ERT) rather than as a function value (or closure).
Thus, the outer function treats the inner function as code during one phase, and the inner function
becomes a function value (closure) during the next phase. If the emit were omitted, the outer function
could operate on the inner function only as a function value (closure), not as code. Section 4.4.13
shows the IL code that results from translating and executing this example through the necessary
phases.
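
The macro application can be imitated in Python by treating code as nested tuples in one phase and evaluating it in the next. This is a sketch under assumptions and not the IL code of Section 4.4.13: the tuple encoding, the `twice_macro` and `evaluate` functions, and the choice of 5 for the free variable x are all hypothetical, and no type checking is modeled.

```python
def twice_macro(m):
    """Phase i: m is code; build code that applies m twice to the free variable x."""
    return ("apply", m, ("apply", m, "x"))


def evaluate(expr, env):
    """Phase i+1: run the generated code in an environment."""
    if isinstance(expr, str):                     # identifier
        return env[expr]
    tag = expr[0]
    if tag == "quote":
        return expr[1]
    if tag == "lambda":                           # ("lambda", param, body)
        _, param, body = expr
        return lambda arg: evaluate(body, {**env, param: arg})
    if tag == "apply":                            # ("apply", function, argument)
        return evaluate(expr[1], env)(evaluate(expr[2], env))
    if tag == "incr":                             # stands in for succ
        return evaluate(expr[1], env) + 1
    raise ValueError(f"unknown operation: {tag!r}")


# Code for: λ v : number → number . (succ (succ v))
add_two_code = ("lambda", "v", ("incr", ("incr", "v")))

generated = twice_macro(add_two_code)             # still just data
result = evaluate(generated, {"x": 5})            # the next phase supplies x
print(generated)
print(result)
```

During the first step `add_two_code` is never called; it is only copied into a larger fragment. It becomes a function value only when `evaluate` reaches the lambda form in the following phase.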

4.2.  The  Static-IL  Language 

As  shown  in  the  conceptual  model  (Figure  3-3),  a  Phi  program  is  not  executed  directly,  but  is  first 
translated into a corresponding Static-IL program. The translator is defined in Appendix A, though
examples  of  translation  are  given  in  Section  4.4.  This  section  describes  the  Static-IL  language,  which 
includes  some  unusual  language  constructs  for  creating  and  combining  type-checked  program 
fragments  in  the  form  of  ERTs. 

Syntactically, Static-IL looks like an untyped lambda calculus. In fact, Static-IL is typed, though type
declarations are not explicit. Under the programming method shown in Figure 2-2, the Static-IL
Machine is given only Static-IL expressions that are guaranteed free of runtime type errors or unbound
variables, and it generates Static-IL expressions only within valid ERTs.21 Since a valid ERT triplet
includes  a  list  of  all  the  Static-IL  expression's  free  variables  and  their  types,  and  the  type  of  the 
expression,  Static-IL  expressions  should  be  regarded  as  typed.22  We  speak  of  Static-IL  expressions  as 
being  well  typed  in  the  same  sense  that  one  would  speak  of  the  object  code  for  a  compiled  Pascal 
program  as  being  well  typed,  even  though  the  type  information  from  the  source  program  is  stripped 
out  after  being  checked,  when  the  object  code  is  generated. 

4.2.1. Lexical Scoping, ERTs, and Macros

Static-IL expressions are lexically scoped. Nonetheless, if ERTs are explicitly manipulated by the
programmer, just as with conventional macros, it is possible to generate new expressions in which free
variables have become "captured" by local declarations. Note that this is possible only in program
fragments (ERTs) that are explicitly being constructed, as first-class data objects. When an expression
is executed, that expression is absolutely lexically (or statically) scoped, and no such anomalies are
possible. 


21

A valid ERT was defined in Section 2.2.2.

22

John Mitchell and David MacQueen have pointed out that it may be better to regard the implementation language as
consisting of the entire ERT triplet (rather than just the Expression component), since the R (Required-environment) and T
(Type) components of the ERT triplet contain the type information for the E (Expression) component.

The examples below illustrate how, in constructing an expression by manipulating ERTs as first-class
values,  a  variable  can  appear  to  become  "captured",  as  with  macros.  In  the  following  Static-Phi 
program,  x  will  evaluate  to  the  ERT  representing  the  outer  z,  thus  causing  the  outer  z  to  be  placed  into 
the  scope  of  the  inner  z. 


λ z : number → (funtype number number) .
(eval

(λ x : ert → ert .

(emit λ z : number → number . x)


For example, the following Static-Phi program (which simply supplies actual parameters for the
functions in the preceding program),


(λ z : number → (funtype number number) .
(eval

(λ x : ert → ert .

(emit λ z : number → number . x)


will  be  phase  evaluated  to  produce  the  following  Static-IL  program, 

(apply  (apply  (lambda  z  (lambda  z  z))  (quote  5))  (quote  10)) 

which  evaluates  to  10. 

The behavior illustrated above is quite intentional -- it was not an oversight -- though it is different
than  one  might  naively  expect.  The  explicit  intent  here  is  to  manipulate  program  fragments  (ERTs)  to 
construct  new  programs  with  new  semantics.  This  behavior  is  useful  for  program-writing  programs, 
and  is  analogous  to  the  behavior  of  conventional  macros.  Also,  bear  in  mind  that  under  no 
circumstances  can  this  behavior  cause  a  runtime  type  error.  Any  attempt  to  cause  a  type  mismatch  in 
the  constructed  code  will  be  detected  as  a  compiletime  error,  one  phase  before  the  constructed  code 
can be executed.

4.2.2. Conventional Lambda Calculus Operations


The following Static-IL primitives look and function exactly like the basic operations of an untyped
lambda calculus, written in the style of LISP:

(quote ev)  Any quoted expressible value. The value ev is simply returned, unevaluated. In
Static-IL, constants appear as explicitly quoted values.

id  An identifier. Its value is simply retrieved from the environment.


(lambda id expr)  Function abstraction. Id is the formal parameter; expr is the function body. A
lambda abstraction evaluates to a closure, consisting of the current environment, the
formal parameter, and the function body.

(apply expr_f expr_x)

Function application. Expr_f evaluates to a function closure; expr_x is evaluated and
becomes the actual argument. The function application has already been type
checked during the previous phase.

(funtype expr_d expr_r)

This operation is used to generate the type of a function. Subexpressions expr_d and
expr_r are simply evaluated in the current environment: they evaluate to types.
These types, type_d and type_r, are used as the domain and range types of the function
type that is returned.

The returned function type is represented as a pair, tagged with the word fun:
<fun type_d, type_r>. For example, <fun number number> represents the type of a
function that takes a number and returns a number.

(  mcr  expr )  Increment.  This  operation  returns  the \  alue  of  the  expression  plus  one.  There  is  no 
corresponding  operauon  m  Static-Phi:  mcr  is  only  included  in  Stauc-IL  to  make 
the  examples  in  Sections  4.1.4  and  4.4  more  interesting.  (In  the  examples  of  Section 
4.4.  mcr  is  used  to  implement  the  succ  function,  which  is  assumed  to  be  supplied  in 
the  environment.) 

These  operations  are  not  discussed  further  here 
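The conventional operations above can be illustrated with a small evaluator. The following is only a sketch in Python, not the Static-IL Machine defined in Appendix A: the tuple encoding of expressions, the ("quote", v) form standing for 'v, and the name evaluate are all assumptions made for illustration.

```python
# Sketch of the conventional Static-IL operations (assumed encoding):
#   identifier        -> a Python string
#   'ev               -> ("quote", ev)
#   (lambda id expr)  -> ("lambda", id, expr)
#   (apply f x)       -> ("apply", f, x)
#   (funtype d r)     -> ("funtype", d, r)
#   (incr expr)       -> ("incr", expr)

def evaluate(expr, env):
    """Evaluate a conventional expression in env (a dict of bindings)."""
    if isinstance(expr, str):          # identifier: retrieve its value
        return env[expr]
    op = expr[0]
    if op == "quote":                  # quoted value: returned unevaluated
        return expr[1]
    if op == "lambda":                 # closure = formal, body, environment
        _, param, body = expr
        return ("closure", param, body, env)
    if op == "apply":                  # no type check here: done a phase earlier
        _, f_expr, x_expr = expr
        _, param, body, closure_env = evaluate(f_expr, env)
        arg = evaluate(x_expr, env)
        return evaluate(body, {**closure_env, param: arg})
    if op == "funtype":                # builds the tagged pair <fun t_d, t_r>
        _, d_expr, r_expr = expr
        return ("fun", evaluate(d_expr, env), evaluate(r_expr, env))
    if op == "incr":                   # value of the expression plus one
        return evaluate(expr[1], env) + 1
    raise ValueError(f"unknown operation: {op!r}")

# succ implemented with incr, as in the examples of Section 4.4.
succ = evaluate(("lambda", "z", ("incr", "z")), {})
result = evaluate(("apply", "f", ("apply", "f", "x")), {"f": succ, "x": 5})
num_to_num = evaluate(("funtype", ("quote", "number"), ("quote", "number")), {})
```

Note that evaluate performs no type checking at all, which mirrors the claim that these operations behave like an untyped lambda calculus.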

4.2.3.  Some  Unusual  Operations 

The purpose of the conventional Static-IL operations listed above is to do conventional computations -- to manipulate basic values as in a lambda calculus. The only perceptible difference is that types are also manipulated along with other basic values.

In contrast, the rest of Static-IL's primitive operations do not look so conventional. Their ultimate purpose is to produce type-checked program fragments (ERTs). That is, the ultimate purpose of the
operations  listed  below  is  to  type  check  and  generate  the  conventional  operations  listed  above.  All  of 
the  Static-IL  constructs  discussed  below  return  ERTs  as  their  result.  Several  of  them  involve  a 
parameter  n,  which  is  a  constant  determined  during  translation,  that  indicates  how  many  phases  to 
wait  before  generating  one  of  the  conventional  operations.  The  Static-Phi  translator  uses  the  emits  and 
evals  to  determine  during  what  phase  each  of  the  various  conventional  operations  should  occur,  and 
generates  the  Static-IL  program  with  the  corresponding  ns.  The  examples  in  Section  4.4,  and  in 
particular the two examples in Sections 4.4.3 and 4.4.9, demonstrate what happens in successive phases,
how  these  language  constructs  work,  and  the  purpose  of  these  n  parameters. 

4.2.3.1. (deep-const c t n)

This construct always returns an ERT. Its purpose is to generate a type-checked quoted constant in the proper phase, i.e. a Static-IL program of the form 'ev. C is any constant value, t is its type, and n is the number of phases to wait before the constant is needed. Deep-const can be thought of as deeply quoting the constant. (Constants are not assumed to be self-quoting.)

The operation deep-const is evaluated as follows. If n > 0, the ERT <(deep-const c t n-1), <>, ert> is returned; otherwise (when n = 0), the ERT <'c, <>, t> is returned. The idea is that each time deep-const is evaluated, it basically just decrements n, returning the same kind of ERT until n reaches 0. When n reaches 0, an ERT containing the quoted constant and its type is returned. [23]
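The countdown performed by deep-const can be sketched as follows. This is an illustrative Python rendering under assumed encodings: an ERT is a 3-tuple (expression, required-environment, type), the empty Required-environment <> is an empty tuple, and 'c is written ("quote", c).

```python
def deep_const(c, t, n):
    """Evaluate (deep-const c t n), returning an ERT as a 3-tuple."""
    if n > 0:
        # Not yet the right phase: emit another deep-const with n - 1.
        return (("deep-const", c, t, n - 1), (), "ert")
    # n == 0: emit the quoted constant, exposing its type t.
    return (("quote", c), (), t)

# The actual argument 5 of Section 4.4.3 waits one phase.
first = deep_const(5, "number", 1)
# Evaluating the expression produced by the first phase:
second = deep_const(*first[0][1:])
```

Here first carries type ert and hides the constant's type, while second finally exposes number as the Type component.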

4.2.3.2. (check-funtype expr_d expr_r n)

Check-funtype always returns an ERT. It is used to generate an ERT containing a funtype expression as its Expression component, when n is 0.

Both expr_d and expr_r will evaluate to ERTs; call them <e_d, r_d, t_d> and <e_r, r_r, t_r>.

Let us first consider the case when the number n is 0, in which case a funtype ERT will be returned. Both t_d and t_r must be the type constant type, indicating that e_d and e_r will evaluate to types during the next phase. (It is a "compiletime" error if either t_d or t_r is not type.) Next, the Expression components e_d and e_r are used to build the Expression component of the ERT that check-funtype will return. For example, if e_d and e_r are 'number and 'number, check-funtype will return an ERT with the expression component (funtype 'number 'number).

[23] Note that the constant's type is hidden until the phase before the constant is used, even though the type is determined syntactically by the original Static-Phi program. This means that if one type of constant is written where some other type is required, the type mismatch will not be discovered until the phase before the value of the constant would have been used, even though it certainly would be better to report the error as early as possible. This issue is mentioned further in Section [illegible].

Similarly, the resulting ERT's Required-environment is formed by combining the Required-environments r_d and r_r. This means that the free variables of the resulting expression include the free variables of both of its subexpressions e_d and e_r. However, the Required-environments must be consistent: if a variable appears in both, it must have the same type; otherwise it is a "compiletime" error.

Finally, the Type component of the resulting ERT will be type -- funtype always returns a type.

If n > 0, then this is not the right phase to generate a funtype expression; instead, another check-funtype expression will be generated, and n will be decremented, as for deep-const. In this case, both t_d and t_r must be the type constant ert, indicating that e_d and e_r will evaluate to ERTs during the next phase. The resulting ERT will be constructed in a manner similar to the case when a funtype expression is generated, except that the Type component of the resulting ERT will be ert instead of type.
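The two cases of check-funtype, together with the consistency rule for Required-environments, might be sketched as below. This is a hedged Python illustration, not the formal definition of Appendix A: ERTs are 3-tuples, Required-environments are dicts from identifier to type, and the names CompiletimeError, combine and check_funtype are assumptions. The function takes the two already-evaluated ERTs as arguments.

```python
class CompiletimeError(Exception):
    """Raised when a check fails one phase before the code would run."""

def combine(req_a, req_b):
    """Merge two Required-environments; shared identifiers must agree."""
    merged = dict(req_a)
    for ident, typ in req_b.items():
        if merged.setdefault(ident, typ) != typ:
            raise CompiletimeError(f"{ident} is required with two types")
    return merged

def check_funtype(ert_d, ert_r, n):
    """Evaluate (check-funtype expr_d expr_r n) given its evaluated ERTs."""
    (e_d, r_d, t_d), (e_r, r_r, t_r) = ert_d, ert_r
    expected = "type" if n == 0 else "ert"
    if t_d != expected or t_r != expected:
        raise CompiletimeError(f"domain and range must have type {expected}")
    req = combine(r_d, r_r)
    if n == 0:
        # Right phase: generate the conventional funtype expression.
        return (("funtype", e_d, e_r), req, "type")
    # Too early: generate another check-funtype with n decremented.
    return (("check-funtype", e_d, e_r, n - 1), req, "ert")

number = (("quote", "number"), {}, "type")   # ERT from (deep-const number type 0)
funtype_ert = check_funtype(number, number, 0)
```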

4.2.3.3. (check-lambda id expr_d expr_r expr_body)

The check-lambda construct is analogous to the check-funtype construct: it is used to generate a lambda Static-IL expression, and it always returns an ERT. However, check-lambda differs from check-funtype in two important ways: it has a bound variable, id; and two of its subexpressions, expr_d and expr_r, evaluate to types, while the other, expr_body, evaluates to an ERT.

The check-lambda construct is evaluated as follows. First, the type expressions expr_d and expr_r are evaluated in the current environment; call the resulting types t_d and t_r.

Next, an ERT <id, <id, t_d>, t_d> is formed for the bound variable and its type. The Expression component is simply the formal parameter; the Type component is the function's domain type (the type of the formal parameter); and the Required-environment lists only the formal parameter. This ERT will be used in type checking the body of the function.

Now the body expression expr_body is evaluated in an environment augmented by the binding of id to the ERT <id, <id, t_d>, t_d>, and the result is a (type-checked) ERT <e_body, r_body, t_body>. To verify that the function body really does return the declared type, the Static-IL Machine must have t_body = t_r; it is a "compiletime" error if they are not equal. [24]

Because Static-Phi allows the programmer to write functions on ERTs -- like macros -- it is possible that the body expression references a variable that has the same name as the formal parameter, id, but a different type. (This is discussed and illustrated in Section 4.2.1.) Therefore, to ensure that any free instances of id in the body really are the declared type, id is looked up in the required-environment r_body to verify that its type is t_d.

Finally, the Static-IL Machine constructs the ERT that is returned by check-lambda. The Expression component is (lambda id e_body). The Required-environment component is just the required-environment from the body, r_body, with the formal parameter, id, removed. [25] The Type component is <fun t_d, t_r>.
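The checks made by check-lambda can be sketched in Python as follows. This is an illustration under assumed encodings (ERTs as 3-tuples, Required-environments as dicts), and it takes the already-evaluated types and body ERT as arguments instead of evaluating subexpressions itself; the names param_ert and check_lambda are hypothetical.

```python
class CompiletimeError(Exception):
    """Raised when a check fails one phase before the code would run."""

def param_ert(ident, t_d):
    """The ERT <id, <id, t_d>, t_d> bound to the formal parameter."""
    return (ident, {ident: t_d}, t_d)

def check_lambda(ident, t_d, t_r, body_ert):
    """Build the ERT for (lambda id e_body) from a type-checked body ERT."""
    e_body, r_body, t_body = body_ert
    if t_body != t_r:
        raise CompiletimeError("body type differs from declared range type")
    if ident in r_body and r_body[ident] != t_d:
        raise CompiletimeError("free instance of formal has the wrong type")
    # The formal parameter is bound by the lambda, so it is no longer required.
    req = {k: v for k, v in r_body.items() if k != ident}
    return (("lambda", ident, e_body), req, ("fun", t_d, t_r))

# Identity function of Section 4.4.2: the body is just the formal parameter.
identity = check_lambda("x", "number", "number", param_ert("x", "number"))
```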

4.2.3.4. (check-check-lambda id expr_d expr_r expr_body n)

This construct is used to generate a check-lambda IL expression. Subexpressions expr_d, expr_r, and expr_body all evaluate to ERTs; an ERT is always returned. Check-check-lambda is analogous to check-funtype in that it waits for the phase when n = 0 before generating and returning a check-lambda expression. For other phases when n > 0, it just decrements n and returns another check-check-lambda expression in the resulting ERT.

Recall that the purpose of the check-lambda construct is to generate type-checked lambda expressions. Similarly, check-check-lambda is provided for generating type-checked check-lambda expressions. Remember that every expression must be guaranteed type correct during the phase before it is executed. But notice that two of the arguments to check-lambda are assumed to evaluate to types. Check-check-lambda does the type checking necessary to guarantee that those two arguments will indeed evaluate to types.

At this point, the question usually arises as to whether further check-check-check- or check-check-check-check-lambda constructs might be needed. Fortunately, they are not, and the reason is that for check-check-lambda, all evaluated arguments evaluate to ERTs, and the Static-Phi Translator ensures that every expression will initially evaluate to an ERT. That is, the purpose of check-check-lambda is to ensure that all of check-lambda's evaluated arguments will indeed be the expected types, and it is required because two of check-lambda's arguments must be type expressions. But all of check-check-lambda's evaluated arguments must be ERTs. And since the Static-Phi Translator only generates expressions that are guaranteed to evaluate to ERTs, no further check-check-check-lambda is needed to ensure that the evaluated arguments to check-check-lambda will be ERTs.

[24] Most languages would not actually require these types to be identical, but would instead require only that t_body be a type that is coercible to t_r. Such gratuitous type conversions do not make a language fundamentally more powerful when the programmer could just as well explicitly call standard type-conversion functions as needed. Coercions are simply provided for convenience.

[25] The formal parameter id is a free variable in the function body, but looking from outside at the entire lambda expression, it is bound by the lambda.

Check-check-lambda is evaluated as follows. Subexpressions expr_d and expr_r are evaluated to ERTs. If n > 0, their type components must be ert; otherwise (when n = 0), their type components must be type. Next, an ERT is constructed from the formal parameter, id, for use in type checking the body. This is similar to check-lambda, except that the type of id is always ert. As with check-lambda, the body expression expr_body is evaluated to an ERT in an environment augmented by this binding. If this ERT's required-environment lists the formal parameter id, its type should be ert. Finally, the return ERT is constructed from the Expression and Required-environment components of the ERTs obtained from evaluating check-check-lambda's subexpressions. If n > 0, n is decremented and another check-check-lambda is generated for the Expression component; otherwise (when n = 0), a check-lambda is generated for the Expression component. In either case, the Type component is ert. The Required-environment component is generated by combining the subexpressions' required-environments, with the formal parameter removed. However, the Static-IL Machine must first ensure that these required-environments are compatible: any identifier listed in any of the required-environments must be listed with the same type in each of the subexpressions' required-environments.
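The evaluation rule for check-check-lambda might be sketched as follows. This Python fragment is only an illustration: it assumes ERTs are 3-tuples with dict Required-environments, takes the three already-evaluated ERTs as arguments, and reads "with the formal parameter removed" as applying to the combined Required-environment, which is one possible interpretation of the text. The function and helper names are hypothetical.

```python
class CompiletimeError(Exception):
    """Raised when a check fails one phase before the code would run."""

def combine(*reqs):
    """Merge Required-environments; shared identifiers must agree."""
    merged = {}
    for req in reqs:
        for ident, typ in req.items():
            if merged.setdefault(ident, typ) != typ:
                raise CompiletimeError(f"{ident} is required with two types")
    return merged

def check_check_lambda(ident, ert_d, ert_r, ert_body, n):
    """Evaluate check-check-lambda given its three evaluated ERTs."""
    (e_d, r_d, t_d), (e_r, r_r, t_r), (e_b, r_b, _t_b) = ert_d, ert_r, ert_body
    expected = "type" if n == 0 else "ert"
    if t_d != expected or t_r != expected:
        raise CompiletimeError(f"type expressions must have type {expected}")
    if r_b.get(ident, "ert") != "ert":
        raise CompiletimeError("formal parameter must have type ert here")
    # Assumption: the formal is removed after combining all three.
    req = combine(r_d, r_r, r_b)
    req.pop(ident, None)
    if n == 0:
        expr = ("check-lambda", ident, e_d, e_r, e_b)
    else:
        expr = ("check-check-lambda", ident, e_d, e_r, e_b, n - 1)
    return (expr, req, "ert")

# Phase 1 of the Identity Abstraction example (Section 4.4.2).
number = (("quote", "number"), {}, "type")   # from (deep-const number type 0)
body = ("x", {"x": "ert"}, "ert")            # x bound to an ERT of type ert
phase1 = check_check_lambda("x", number, number, body, 0)
```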

4.2.3.5. (check-apply expr_f expr_x)

This construct is used to generate an apply Static-IL expression. Subexpressions expr_f and expr_x both evaluate to ERTs; an ERT is returned.

This construct is different from the others in that the Static-Phi Translator does not determine, in advance, the phase in which a function application will actually occur. Instead, the check-apply operation monitors the Type component from its first argument to see when it will become a function rather than an ERT. If it will be a function, the function's domain type is checked against the type of the actual parameter; otherwise, another check-apply is generated.

Check-apply is evaluated as follows. First, subexpressions expr_f and expr_x are evaluated to ERTs <e_f, r_f, t_f> and <e_x, r_x, t_x>. If t_f is a fun type, <fun t_d, t_r>, then t_d must equal t_x, and an apply ERT is returned with Type component t_r. Otherwise, t_f must be ert (it is a "compiletime" error if it is not), and another check-apply ERT is returned with Type component ert. In either case, the Required-environment component of the resulting ERT is formed by combining r_f and r_x, which must be consistent. Finally, it is a compiletime error if t_f is ert but t_x is not ert, because this means that the function argument would evaluate to some final basic value (such as a number or boolean) in the next phase, whereas the function expression will evaluate to another ERT.
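The decision made by check-apply can be sketched as below. As with the other sketches, this is an assumed Python encoding (3-tuple ERTs, dict Required-environments, fun types as ("fun", t_d, t_r)) and not the formal definition; it takes the two evaluated ERTs as arguments.

```python
class CompiletimeError(Exception):
    """Raised when a check fails one phase before the code would run."""

def combine(req_a, req_b):
    """Merge two Required-environments; shared identifiers must agree."""
    merged = dict(req_a)
    for ident, typ in req_b.items():
        if merged.setdefault(ident, typ) != typ:
            raise CompiletimeError(f"{ident} is required with two types")
    return merged

def check_apply(ert_f, ert_x):
    """Evaluate (check-apply expr_f expr_x) given its evaluated ERTs."""
    (e_f, r_f, t_f), (e_x, r_x, t_x) = ert_f, ert_x
    req = combine(r_f, r_x)
    if isinstance(t_f, tuple) and t_f[0] == "fun":
        _, t_d, t_r = t_f
        if t_d != t_x:
            raise CompiletimeError("argument type differs from domain type")
        return (("apply", e_f, e_x), req, t_r)      # apply in the next phase
    if t_f != "ert":
        raise CompiletimeError("operator is neither a function nor an ERT")
    if t_x != "ert":
        raise CompiletimeError("operator is an ERT but the argument is not")
    return (("check-apply", e_f, e_x), req, "ert")  # wait another phase

# Phase 1 of the F Twice example (Section 4.4.1), with the types already fixed.
NUM_TO_NUM = ("fun", "number", "number")
f = ("f", {"f": NUM_TO_NUM}, NUM_TO_NUM)
x = ("x", {"x": "number"}, "number")
twice = check_apply(f, check_apply(f, x))
```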

4.2.4. Efficiency of Static-IL

Given  that  the  Static-IL  language  includes  both  compiletime  and  runtime  operations,  how  efficiently 
can  it  be  processed?  Must  it  be  less  efficient  than  a  conventional  lambda  calculus?  Might  it  be  more 
efficient?  Without  focusing  on  the  details  of  any  specific  implementation,  we  can  make  some  general 
observations  about  Static-IL’s  inherent  efficiency.  Since  the  Static-IL  Machine  is  used  both  for 
compiletime  and  runtime,  let  us  examine  these  roles  separately. 

On  one  hand,  when  the  Static-IL  Machine  is  playing  the  role  of  runtime,  the  operations  performed 
are  just  the  simple  operations  listed  in  Section  4.2.2.  These  are  in  fact  identical  to  the  operations  of  a 
conventional  untyped  lambda  calculus,  and  hence  can  be  just  as  efficiently  processed. 

On  the  other  hand,  when  the  Static-IL  Machine  is  playing  the  role  of  compiletime,  it  may  be  more 
efficient  than  a  conventional  compiler  because  it  can  use  the  basic  operations  of  the  runtime  machine 
directly  instead  of  simulating  them.  For  example,  constant  expressions  are  evaluated  directly  at 
compiletime  by  our  single  Static-IL  Machine,  whereas  a  conventional  compiler  must  evaluate  them  by 
simulating  the  action  of  the  runtime  machine. 

Hence, the Static-IL Machine can be just as efficient at performing runtime operations as a
conventional  runtime  machine,  and  may  be  more  efficient  at  performing  compiletime  operations  than 
a  conventional  compiler. 


4.3.  Environments  for  Static-IL  Programs 


As  discussed  in  Section  2.2.5,  environments  supply  bindings  of  identifiers  to  values,  for  all  of  a 
program's  free  variables.  In  our  simple  model,  the  environment  provides  the  input  for  a  Static-IL 
program,  and  a  separate  environment  must  be  provided  for  each  phase  used.  This  section  provides 
some  insight  into  the  purpose  of  these  environments. 

4.3.1.  Static-Phi  Program  X 

Let us begin by considering a Static-Phi program consisting of only the single identifier, x. This Static-Phi program will be translated to the ERT <x, <x, ert>, ert>. The Expression component is simply x; the Required-environment component is <x, ert>, meaning that the only free variable in the expression is x and its type is ert; and the Type component is ert, because the expression x will evaluate to an ERT.

In order to evaluate the Expression component x we must provide an environment that satisfies [26] the Required-environment. In this case, the environment must include a value of type ert for x. As mentioned in Section 3.3.4, every identifier starts out as type ert; that is, during the first phase, every identifier must be bound to an ERT. Let us consider some of the possible ERT values that we might provide for x.

Suppose we supplied the ERT value <x, <x, ert>, ert> as the value of x in the environment. Then, in the first phase, the expression x would simply evaluate to this value -- <x, <x, ert>, ert>. But this is precisely the ERT that resulted from translating the original Static-Phi program! In effect, x has simply evaluated to itself. This is known as the default ERT for x.

4.3.2.  Definition:  Default  ERT 

For any identifier id, the ERT <id, <id, ert>, ert> is called the default ERT for this identifier.


[26] "Satisfies" is defined in Section 2


4.3.3.  The  Purpose  of  Default  ERTs 


Default ERTs are used to pass identifiers through some number of phases before fixing their types. (Fixing an identifier's type is discussed below in Section 4.3.4.) They are called "default" ERTs because a command interpreter would normally provide a default ERT binding for each identifier of type ert that was not to be fixed to some other type.

4.3.4.  Fixing  the  Type  of  an  Identifier 

Suppose that we supply a slightly different ERT value for x in the environment: <x, <x, number>, number>. This looks similar to the default ERT, but the type of x is given as number in the Required-environment, and the Expression's result type is then number also. If we supply this ERT as the value for x in the environment, then, of course, x evaluates to this ERT -- <x, <x, number>, number>. In this case, even though the Expression component is again x, the type of x is now given as number; that is, x must be bound to a number in the next phase. Whereas the default ERT simply caused x to evaluate to itself, this ERT fixes the type of x to be type number.

4.3.5.  Fixing  an  Identifier  as  a  Function  Type 

The  last  example  fixed  x  as  type  number.  We  could  just  as  well  fix  it  to  be  some  function  type.  For 
example,  if  we  provided  the  following  ERT  value  for  x  in  the  environment, 

    <x, <x, (fun number number)>, (fun number number)>

then  in  the  next  phase,  x  must  be  bound  to  some  function  from  numbers  to  numbers. 

4.3.6.  Fixing  an  Identifier  as  a  Macro 

We have just shown how the type of x could be fixed as a function from numbers to numbers. If we instead fixed the type of x as a function from ERTs to ERTs, by supplying the following ERT value for x in the environment,

    <x, <x, (fun ert ert)>, (fun ert ert)>

then x would act as a macro in the next phase. That is, in the next phase, x would be bound to some function that takes code (an ERT) and produces code (an ERT) as its result.


4.3.7. Fixing the Value of an Identifier

In the last three examples, x was bound to an ERT that fixed the type of x for the next phase, requiring x to be bound to some number or a function during the next phase. Thus, the type of x was fixed for the next phase, but the value of x was not fixed for the next phase. Suppose we had instead bound x to the ERT <'5, <>, number>. In this case, not only is the type fixed for the next phase, but the value is a constant: 5. Since the Expression component of this ERT has no free variables, the identifier x does not even appear in the Required-environment.

4.3.8.  Other  Possibilities 

Of course, these are not the only interesting ERT values that might be bound to x. For example, suppose the ERT <y, <y, ert>, ert> were provided as the value of x. This ERT is identical to the default ERT for x except that it uses the identifier y instead. This, in effect, renames x to y for the next phase.

So far we have discussed some of the ERTs that might be supplied in the environment as values of a Static-IL program's free variables. Of course, free variables of other types, such as number or (fun number number), would have to be bound to values of those types in the environment.
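The alternative bindings for x discussed in Sections 4.3.1 through 4.3.8 can be collected in one place. The following Python sketch uses an assumed encoding (ERTs as 3-tuples, Required-environments as dicts, 'v written ("quote", v)); the helper names default_ert, fix_type and fix_value are hypothetical and are not part of Static-IL.

```python
def default_ert(ident):
    """<id, <id, ert>, ert>: the identifier evaluates to itself."""
    return (ident, {ident: "ert"}, "ert")

def fix_type(ident, typ):
    """<id, <id, typ>, typ>: fixes the identifier's type for the next phase."""
    return (ident, {ident: typ}, typ)

def fix_value(value, typ):
    """<'value, <>, typ>: fixes both the type and the value."""
    return (("quote", value), {}, typ)

candidates = {
    "default":  default_ert("x"),
    "number":   fix_type("x", "number"),
    "function": fix_type("x", ("fun", "number", "number")),
    "macro":    fix_type("x", ("fun", "ert", "ert")),
    "constant": fix_value(5, "number"),
    "renamed":  default_ert("y"),
}

# The Static-Phi program x translates to the default ERT for x; evaluating
# its Expression component in phase 1 is just an environment lookup.
translated = default_ert("x")
phase1_results = {
    label: {"x": binding}[translated[0]]
    for label, binding in candidates.items()
}
```

In every case the phase 1 result is exactly the ERT that was supplied, which is why the choice of binding alone determines what x means in the next phase.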


4.3.9.  What  Values  to  Supply  in  What  Phases 

Since a different environment is supplied for each phase, the question arises as to what each of these environments should include. Of course, the Required-environment specifies the types of the values that must be provided in the environment, but it does not tell the purposes of these values. In particular, there were several different kinds of ERT values discussed above that might be used for an identifier of type ert. How do we know which is appropriate?

The answer depends on the program and the programmer's intent. Every program will be expecting certain kinds of input via the environment in certain phases. Information on the kind of input expected in each phase (other than its type) must be provided as external documentation, in the same way that the purposes of any conventional program's inputs must be documented.

There is, however, a pattern to the types of the free variables that one would generally expect to see in various phases. Since every identifier starts out as type ert (as mentioned in Section 3.3.4), the default ERT would initially be used for that identifier. Then, during some phase, the type of this identifier will be fixed to some basic (non-ERT) type, as described above, and finally during the next phase the identifier will have a value of that basic type. Thus, pertinent documentation on this identifier should specify during what phase its type should be fixed.


The  examples  in  Section  4.1.4  help  provide  an  understanding  of  how  various  phases  are  used  and 
what  happens  in  each  phase. 

4.4.  Examples  of  Translation  and  Evaluation 

This section shows examples of translating all of the Static-Phi programs shown in Section 4.1.4 to Static-IL programs, and executing the resulting Static-IL programs through phases. The most straightforward and informative examples with which to begin are the two that involve creating and applying an identity function in Sections 4.4.3 and 4.4.9. The Static-Phi Translator and Static-IL Machine are formally defined in Appendix A.

In the examples below, ERT values <e,r,t> are displayed in a LISP-like form:

    (e r t)

Similarly, environments are displayed as LISP-like lists. Each element of the environment lists an identifier-value binding, which is in turn displayed as a LISP-like list, for example:

    (
     (id_1 value_1)
     (id_2 value_2)
     ...
     (id_n value_n)
    )

Finally, Required-environments are also displayed as LISP-like lists. Each element of the Required-environment lists an identifier-type pair, which is in turn displayed as a LISP-like list, for example:

    (
     (id_1 type_1)
     (id_2 type_2)
     ...
     (id_n type_n)
    )
4.4.1. F Twice

This example is complicated by the need for a (non-empty) environment. Therefore, the reader is advised to first study the example of Section 4.4.3, Identity Application, which involves no free variables.

    (f (f x))

    ;----- Result of Translation -----
    (
     (check-apply f (check-apply f x))      ; Expression
     ((f ert) (x ert))                      ; Req-env
     ert                                    ; Type
    )

    ;----- Environment for Phase 1 -----
    (
     (x (x ((x number)) number))
     (f (f ((f (fun number number))) (fun number number)))
    )

    ;----- Result of Phase 1 -----
    (
     (apply f (apply f x))                  ; Expression
     ((f (fun number number)) (x number))   ; Req-env
     number                                 ; Type
    )

    ;----- Environment for Phase 2 -----
    (
     (x 5)
     (f (closure z (incr z) ()))
    )

    ;----- Result of Phase 2 -----
    7

4.4.2. Identity Abstraction

See the example of Identity Application in Section 4.4.3.

    λ x : number → number . x

    ;----- Result of Translation -----
    (
     (check-check-lambda                    ; Expression
       x
       (deep-const number type 0)
       (deep-const number type 0)
       x
       0
     )
     ()                                     ; Required-environment
     ert                                    ; Type
    )

    ;----- Environment for Phase 1 -----
    ()

    ;----- Result of Phase 1 -----
    (
     (check-lambda x 'number 'number x)     ; Expression
     ()                                     ; Required-environment
     ert                                    ; Type
    )

    ;----- Environment for Phase 2 -----
    ()

    ;----- Result of Phase 2 -----
    (
     (lambda x x)                           ; Expression
     ()                                     ; Required-environment
     (fun number number)                    ; Type
    )

    ;----- Environment for Phase 3 -----
    ()

    ;----- Result of Phase 3 -----
    (closure x x ())

4.4.3.  Identity  Application 


This is the best of these examples to study first.

The correspondence between the Static-Phi program and the ERT that results from translation (shown below) is as follows. To dispense with the easy parts first, the Required-environment component of the ERT is empty, because there are no free variables, and the Type component is ert, indicating that the result of the first phase will be an ERT, as it always is. In the Expression component, the function application of the original Static-Phi program has been translated to a check-apply Static-IL construct. The λ abstraction was translated to a check-check-lambda, using the formal parameter name; the type constants number and number were translated to deep-const forms, listing the number of phases to wait as 0; and the identifier, x, supplied as the function body, was simply translated to itself, x. The generated check-check-lambda lists the number of phases to wait as 0. Finally, the constant 5 that was given as the actual argument was translated to another deep-const form, listing the number of phases to wait as 1. Note that the deep-consts generated for the number type constants have one fewer phase to wait than the deep-const generated for the function's actual argument 5. This is because the type values will be needed to type check the function application, one phase earlier than the function is applied to the constant 5.

The progression through phases is as follows. During the first phase, the types of the function's type expressions (number and number) are checked to ensure that they really are type expressions and not, say, numeric expressions. Upon doing this check, the check-check-lambda produces a check-lambda. During the second phase, these type expressions will be evaluated to the type values number and number, and these types will be used to generate a type-checked lambda. In turn, the check-apply verifies that the type of the actual argument matches the function's declared formal parameter type, and generates an apply form. Finally, in the third phase, the function is applied to the constant 5 to produce a result of 5.

The original Static-Phi program and the progression through phases are shown below. Compare this example with the example in Section 4.4.9.


    ;----- Static-Phi Program -----
    (λ x : number → number . x  5)

    ;----- Result of Translation -----
    (
     (check-apply                           ; Expression:
       (check-check-lambda
         x                                  ;   Formal parameter
         (deep-const number type 0)         ;   Domain type
         (deep-const number type 0)         ;   Range type
         x                                  ;   Expression body
         0                                  ;   Phases to wait
       )
       (deep-const 5 number 1)              ;   Actual argument
     )
     ()                                     ; Required-environment
     ert                                    ; Type
    )

    ;----- Environment for Phase 1 -----
    ()

    ;----- Result of Phase 1 -----
    (
     (check-apply                           ; Expression
       (check-lambda x 'number 'number x)
       (deep-const 5 number 0)
     )
     ()                                     ; Required-environment
     ert                                    ; Type
    )

    ;----- Environment for Phase 2 -----
    ()

    ;----- Result of Phase 2 -----
    (
     (apply (lambda x x) '5)                ; Expression
     ()                                     ; Required-environment
     number                                 ; Type
    )

    ;----- Environment for Phase 3 -----
    ()

    ;----- Result of Phase 3 -----
    5
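The three phases of the Identity Application example can be reproduced with a small evaluator that handles both the conventional and the checking constructs. This is a sketch in Python under assumed encodings (expressions as tuples, identifiers as strings, ERTs as 3-tuples with dict Required-environments); it follows the prose descriptions of Section 4.2.3 and is not the formal Static-IL Machine of Appendix A. The name ev is hypothetical.

```python
class CompiletimeError(Exception):
    """A type error caught one phase before the offending code would run."""

def combine(*reqs):
    merged = {}
    for req in reqs:
        for ident, typ in req.items():
            if merged.setdefault(ident, typ) != typ:
                raise CompiletimeError(f"{ident} is required with two types")
    return merged

def without(req, ident):
    return {k: v for k, v in req.items() if k != ident}

def ev(expr, env):
    if isinstance(expr, str):                       # identifier
        return env[expr]
    op, *args = expr
    if op == "quote":
        return args[0]
    if op == "lambda":
        return ("closure", args[0], args[1], env)
    if op == "apply":
        _, param, body, cenv = ev(args[0], env)
        return ev(body, {**cenv, param: ev(args[1], env)})
    if op == "deep-const":
        c, t, n = args
        return ((op, c, t, n - 1), {}, "ert") if n > 0 else (("quote", c), {}, t)
    if op == "check-check-lambda":
        ident, d, r, body, n = args
        (e_d, r_d, t_d), (e_r, r_r, t_r) = ev(d, env), ev(r, env)
        e_b, r_b, _ = ev(body, {**env, ident: (ident, {ident: "ert"}, "ert")})
        want = "ert" if n > 0 else "type"
        if (t_d, t_r) != (want, want) or r_b.get(ident, "ert") != "ert":
            raise CompiletimeError(op)
        if n > 0:
            new = (op, ident, e_d, e_r, e_b, n - 1)
        else:
            new = ("check-lambda", ident, e_d, e_r, e_b)
        return (new, without(combine(r_d, r_r, r_b), ident), "ert")
    if op == "check-lambda":
        ident, d, r, body = args
        t_d, t_r = ev(d, env), ev(r, env)
        e_b, r_b, t_b = ev(body, {**env, ident: (ident, {ident: t_d}, t_d)})
        if t_b != t_r or r_b.get(ident, t_d) != t_d:
            raise CompiletimeError(op)
        return (("lambda", ident, e_b), without(r_b, ident), ("fun", t_d, t_r))
    if op == "check-apply":
        (e_f, r_f, t_f), (e_x, r_x, t_x) = ev(args[0], env), ev(args[1], env)
        req = combine(r_f, r_x)
        if isinstance(t_f, tuple) and t_f[0] == "fun":
            if t_f[1] != t_x:
                raise CompiletimeError(op)
            return (("apply", e_f, e_x), req, t_f[2])
        if t_f != "ert" or t_x != "ert":
            raise CompiletimeError(op)
        return ((op, e_f, e_x), req, "ert")
    raise ValueError(f"unknown operation: {op!r}")

number_type = ("deep-const", "number", "type", 0)
program = ("check-apply",
           ("check-check-lambda", "x", number_type, number_type, "x", 0),
           ("deep-const", 5, "number", 1))

phase1, req1, type1 = ev(program, {})   # produces check-lambda / deep-const 0
phase2, req2, type2 = ev(phase1, {})    # produces (apply (lambda x x) '5)
answer = ev(phase2, {})                 # runs the type-checked program
```

Replacing the actual argument with a constant of some other type makes the second phase raise CompiletimeError, before any apply is executed, which is the behavior described for check-apply.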


4.4.4. Function Abstraction

    λ x : number → number . (succ (succ x))

    ;----- Result of Translation -----
    (
     (check-check-lambda                            ; Expression
       x
       (deep-const number type 0)
       (deep-const number type 0)
       (check-apply succ (check-apply succ x))
       0
     )
     ((succ ert))                                   ; Req-env
     ert                                            ; Type
    )

    ;----- Environment for Phase 1 -----
    (
     (succ (succ ((succ ert)) ert))
    )

    ;----- Result of Phase 1 -----
    (
     (check-lambda x 'number 'number                ; Expression
       (check-apply succ (check-apply succ x)))
     ((succ ert))                                   ; Req-env
     ert                                            ; Type
    )

    ;----- Environment for Phase 2 -----
    (
     (succ (succ ((succ (fun number number))) (fun number number)))
    )

    ;----- Result of Phase 2 -----
    (
     (lambda x (apply succ (apply succ x)))         ; Expression
     ((succ (fun number number)))                   ; Req-env
     (fun number number)                            ; Type
    )

    ;----- Environment for Phase 3 -----
    (
     (succ (closure z (incr z) ()))
    )

    ;----- Result of Phase 3 -----
    (closure
      x
      (apply succ (apply succ x))
      ((succ (closure z (incr z) ())))
    )


4.4.5.  Function  Application 


(A  x  :  number  -*•  number .  (succ  (succ  x))  5) 


Result  of  Translation 


(check-apply 

(check -check -lamb da 


;  Expression 


(deep-const  number  type  0) 

(deep-const  number  type  0) 

(check-apply  succ  (check-apply  succ  x)) 
0 


(deep-const  5  number  1) 


( ( succ  ert ) ) 


;  Req-env 
;  Type 


Environment  for  Phase  1 


(succ  (succ  ((succ  ert))  ert)) 




Result of Phase 1


(
  (check-apply  ; Expression
    (check-lambda
      x
      'number
      'number
      (check-apply succ (check-apply succ x))
    )
    (deep-const 5 number 0)
  )
  ((succ ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 2


(succ  (succ  ((succ  (fun  number  number)))  (fun  number  number))) 


Result  of  Phase  2 


(
  (apply (lambda x (apply succ (apply succ x))) 5)  ; Expression
  ((succ (fun number number)))  ; Req-env
  number  ; Type
)


Environment for Phase 3


(succ (closure z (incr z) ()))


Result  of  Phase  3 


4.4.6. Higher Order Function Abstraction


λ f : (funtype number number) → number . (f (f x))


Result  of  Translation 


(
  (check-check-lambda  ; Expression
    f
    (check-funtype
      (deep-const number type 0)
      (deep-const number type 0)
      0
    )
    (deep-const number type 0)
    (check-apply f (check-apply f x))
    0
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 1

(x (x ((x ert)) ert))


Result  of  Phase  1 


(
  (check-lambda  ; Expression
    f
    (funtype 'number 'number)
    'number
    (check-apply f (check-apply f x))
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 2

(x (x ((x number)) number))


Result of Phase 2


Result  of  Phase  1 


(
  (check-apply  ; Expression
    (check-lambda
      f
      (funtype 'number 'number)
      'number
      (check-apply f (check-apply f x))
    )
    g
  )
  ((x ert) (g ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 2

(
  (x (x ((x number)) number))
  (g (g ((g (fun number number))) (fun number number)))
)


Result of Phase 2

(
  (apply (lambda f (apply f (apply f x))) g)  ; Expression
  ((x number) (g (fun number number)))  ; Req-env
  number  ; Type
)


Environment for Phase 3

(
  (x 5)
  (g (closure z (incr z) ()))
)


Result  of  Phase  3 
4.4.8. Identity-Function Type Abstraction


λ t : type → ert . (emit λ x : t → t . x)


Result  of  Translation 


(
  (check-check-lambda  ; Expression
    t
    (deep-const type type 0)
    (deep-const ert type 0)
    (check-check-lambda x t t x 1)
    0
  )
  ()  ; Req-env
  ert  ; Type
)


Environment for Phase 1

()


Result of Phase 1

(
  (check-lambda t 'type 'ert  ; Expression
    (check-check-lambda x t t x 0)
  )
  ()  ; Req-env
  ert  ; Type
)


Environment for Phase 2

()


Result of Phase 2

(
  (lambda t (check-lambda x t t x))  ; Expression
  ()  ; Req-env
  (fun type ert)  ; Type
)


Environment for Phase 3

()


Result of Phase 3

(closure t (check-lambda x t t x) ())


4.4.9.  Identity-Function  Type  Application 

This example requires one more phase than the example of Section 4.4.3. If we view what happens
in the various phases in terms of the original Static-Phi program, phase 1 checks the types of the type
expressions in the outer λ abstraction; phase 2 type checks the outer λ abstraction and its application,
and the types of the type expressions in the inner λ abstraction; phase 3 applies the outer function to
the actual argument and produces an ERT for a type-checked identity function on numbers; and
during phase 4 this identity function becomes an actual function value, or closure, that could have been
applied to a numeric argument.

Static-Phi Program

(λ t : type → ert . (emit λ x : t → t . x) number)


Result  of  Translation 


(
  (check-apply  ; Expression
    (check-check-lambda
      t
      (deep-const type type 0)
      (deep-const ert type 0)
      (check-check-lambda x t t x 1)
      0
    )
    (deep-const number type 1)
  )
  ()  ; Req-env
  ert  ; Type
)


Environment for Phase 1

()


Result of Phase 1


(
  (check-apply  ; Expression
    (check-lambda
      t
      'type
      'ert
      (check-check-lambda x t t x 0)
    )
    (deep-const number type 0)
  )
  ()  ; Req-env
  ert  ; Type
)


Environment for Phase 2

()


Result  of  Phase  2 


(
  (apply (lambda t (check-lambda x t t x)) 'number)  ; Expression
  ()  ; Req-env
  ert  ; Type
)


Environment for Phase 3

()


Result of Phase 3

(
  (lambda x x)  ; Expression
  ()  ; Req-env
  (fun number number)  ; Type
)


Environment for Phase 4

()


Result of Phase 4

(closure x x ())
4.4.10. General Type Abstraction


λ t : type → ert . (emit λ f : (funtype t t) → t . (f (f x)))


Result  of  Translation 


(
  (check-check-lambda  ; Expression
    t
    (deep-const type type 0)
    (deep-const ert type 0)
    (check-check-lambda
      f
      (check-funtype t t 1)
      t
      (check-apply f (check-apply f x))
      1
    )
    0
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 1

(x (x ((x ert)) ert))


Result  of  Phase  1 


(
  (check-lambda  ; Expression
    t
    'type
    'ert
    (check-check-lambda
      f
      (check-funtype t t 0)
      t
      (check-apply f (check-apply f x))
      0
    )
  )
  ((x ert))  ; Req-env
  ert  ; Type
)
Environment for Phase 2


(x  (x  ((x  ert))  ert)) 


Result  of  Phase  2 


(
  (lambda  ; Expression
    t
    (check-lambda
      f
      (funtype t t)
      t
      (check-apply f (check-apply f x))
    )
  )
  ((x ert))  ; Req-env
  (fun type ert)  ; Type
)


Environment  for  Phase  3 


(x  (x  ((x  number))  number)) 


Result  of  Phase  3 


(closure
  t
  (check-lambda f (funtype t t) t
    (check-apply f (check-apply f x))
  )
  ((x (x ((x number)) number)))
)
4.4.11. General Type Application


(λ t : type → ert . (emit λ f : (funtype t t) → t . (f (f x))) number)


Result  of  Translation 


(
  (check-apply  ; Expression
    (check-check-lambda
      t
      (deep-const type type 0)
      (deep-const ert type 0)
      (check-check-lambda
        f
        (check-funtype t t 1)
        t
        (check-apply f (check-apply f x))
        1
      )
      0
    )
    (deep-const number type 1)
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 1

(x (x ((x ert)) ert))


Result  of  Phase  1 


(
  (check-apply  ; Expression
    (check-lambda
      t
      'type
      'ert
      (check-check-lambda
        f
        (check-funtype t t 0)
        t
        (check-apply f (check-apply f x))
        0
      )
    )
    (deep-const number type 0)
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment  for  Phase  2 


(x (x ((x ert)) ert))


Result  of  Phase  2 


(
  (apply  ; Expression
    (lambda
      t
      (check-lambda
        f
        (funtype t t)
        t
        (check-apply f (check-apply f x))
      )
    )
    'number
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment  for  Phase  3 


(x (x ((x number)) number))


Result  of  Phase  3 


(
  (lambda f (apply f (apply f x)))  ; Expression
  ((x number))  ; Req-env
  (fun (fun number number) number)  ; Type
)


Environment for Phase 4

(
  (x 5)
)


Result of Phase 4

(closure f (apply f (apply f x)) ((x 5)))

4.4.12.  Macro  Abstraction 

λ m : ert → ert . (m (m x))


Result of Translation


(
  (check-check-lambda  ; Expression
    m
    (deep-const ert type 0)
    (deep-const ert type 0)
    (check-apply m (check-apply m x))
    0
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 1

(x (x ((x ert)) ert))


Result  of  Phase  1 


(
  (check-lambda m 'ert 'ert  ; Expression
    (check-apply m (check-apply m x))
  )
  ((x ert))  ; Req-env
  ert  ; Type
)


Environment  for  Phase  2 


(x  (x  ((x  ert))  ert)) 


Result  of  Phase  2 


(
  (lambda m (check-apply m (check-apply m x)))  ; Expression
  ((x ert))  ; Req-env
  (fun ert ert)  ; Type
)


Environment  for  Phase  3 


(x  (x  ((x  ert))  ert)) 


Result  of  Phase  3 


(closure
  m
  (check-apply m (check-apply m x))
  ((x (x ((x ert)) ert)))
)
4.4.13. Macro Application


(λ m : ert → ert . (m (m x))
   (emit λ y : number → number . (succ (succ y)))
)


Result of Translation


(
  (check-apply  ; Expression
    (check-check-lambda
      m
      (deep-const ert type 0)
      (deep-const ert type 0)
      (check-apply m (check-apply m x))
      0
    )
    (check-check-lambda
      y
      (deep-const number type 1)
      (deep-const number type 1)
      (check-apply succ (check-apply succ y))
      1
    )
  )
  ((x ert) (succ ert))  ; Req-env
  ert  ; Type
)


Environment for Phase 1

(succ (succ ((succ ert)) ert))

(x (x ((x ert)) ert))


Result  of  Phase  1 


(
  (check-apply  ; Expression
    (check-lambda
      m
      'ert
      'ert
      (check-apply m (check-apply m x))
    )
    (check-check-lambda
      y
      (deep-const number type 0)
      (deep-const number type 0)
      (check-apply succ (check-apply succ y))
      0
    )
  )
  ((x ert) (succ ert))  ; Req-env
  ert  ; Type
)


Environment  for  Phase  2 


(succ  (succ  ((succ  ert))  ert)) 

(x  (x  ((x  ert))  ert)) 


Result  of  Phase  2 


(
  (apply  ; Expression
    (lambda m (check-apply m (check-apply m x)))
    (check-lambda
      y
      'number
      'number
      (check-apply succ (check-apply succ y))
    )
  )
  ((x ert) (succ ert))  ; Req-env
  ert  ; Type
)


Environment  for  Phase  3 


(x (x ((x number)) number))

(succ (succ ((succ (fun number number))) (fun number number)))


Result of Phase 3


(
  (apply  ; Expression
    (lambda y (apply succ (apply succ y)))
    (apply (lambda y (apply succ (apply succ y))) x)
  )
  ((succ (fun number number)) (x number))  ; Req-env
  number  ; Type
)


Environment for Phase 4

(
  (x 6)
  (succ (closure z (incr z) ()))
)


Result of Phase 4


Chapter  5 

Using  Phases  for  Partial  Evaluation 


This chapter describes how phases might be used to perform partial evaluation. In this approach, the
phase for a particular subcomputation is not denoted in the source program or determined when the
program is translated, but is determined dynamically by the environment supplied for each phase.
This approach has not been fully explored, and is open for future research, but we demonstrate how it
might proceed by describing a source language, Dynamic-Phi, and the beginnings of an
implementation language, Dynamic-IL. There is no formal semantics given for these languages, since
they are not fully developed.

5.1.  The  Dynamic-Phi  Language 

The Dynamic-Phi language is identical to the Static-Phi language described in Section 4.1, except it
does not provide the emit and eval constructs or the type constant ert; hence Dynamic-Phi is not
discussed further here. Instead of allowing the programmer to explicitly manipulate ERTs under
program control, the system uses ERT values transparently, to represent the results of a partial
evaluation.

5.2.  The  Dynamic-IL  Language 

As with the Static-IL language, Dynamic-IL looks like an untyped lambda calculus because type
declarations are not explicit, but in fact it is strongly typed. Almost all of the constructs of
Static-IL and Dynamic-IL look similar, though the semantics of constructs that manipulate ERT values
are necessarily different, as discussed in Section 5.2.2.


5.2.1.  Conventional  Lambda  Calculus  Operations 

Dynamic-IL has the following simple operations that look like a conventional lambda calculus
written in the LISP [McCarthy 66] style. These function exactly the same as in Static-IL. They are:

'ev  Any quoted expressible value.

id  An identifier.

(lambda id expr)  Function abstraction.

(apply expr_f expr_a)

Function application.

(funtype expr_d expr_r)

The type of a function. As with Static-IL, the returned function type is represented
as a pair, tagged with the word fun: <fun, type_d, type_r>.

These  operations  are  not  discussed  further  here. 
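The behavior of these conventional operations can be illustrated with a small interpreter. The following is only a sketch in Python, not the IL Machine itself: the tuple encoding of expressions, the dictionary environment, and the names evaluate and apply_fn are assumptions made for illustration.

```python
# Hypothetical mini-evaluator for the conventional operations. The tuple
# encoding and the names evaluate/apply_fn are assumptions of this sketch.

def evaluate(expr, env):
    """Evaluate a conventional IL expression in env (a dict of bindings)."""
    if isinstance(expr, str):                 # id: look up an identifier
        return env[expr]
    op = expr[0]
    if op == "quote":                         # 'ev: a quoted expressible value
        return expr[1]
    if op == "lambda":                        # (lambda id expr) yields a closure
        _, ident, body = expr
        return ("closure", ident, body, env)
    if op == "apply":                         # (apply expr_f expr_a)
        _, f_expr, a_expr = expr
        return apply_fn(evaluate(f_expr, env), evaluate(a_expr, env))
    if op == "funtype":                       # (funtype expr_d expr_r)
        _, d_expr, r_expr = expr
        return ("fun", evaluate(d_expr, env), evaluate(r_expr, env))
    raise ValueError(f"unknown operation: {op!r}")


def apply_fn(fn, arg):
    """Apply a host primitive or a closure to an argument value."""
    if callable(fn):                          # primitive supplied by the host
        return fn(arg)
    _, ident, body, saved_env = fn
    return evaluate(body, {**saved_env, ident: arg})


env = {"succ": lambda n: n + 1}
twice_succ = ("lambda", "x", ("apply", "succ", ("apply", "succ", "x")))
result = evaluate(("apply", twice_succ, ("quote", 5)), env)
fun_type = evaluate(("funtype", ("quote", "number"), ("quote", "number")), {})
```

Here result is the value obtained by applying the function of Section 4.4.5 to 5, and fun_type is the tagged representation of the function type described above.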

5.2.2.  Other  Operations 

All but one of the other operations look similar to operations in Static-IL, except that the operations
below lack the n parameter, and hence their semantics are somewhat different. In Static-IL, the n
parameter specified how many phases to wait before generating one of the conventional operations,
and this was determined statically, during translation. But in Dynamic-IL, the determination of when
to generate one of the conventional operations is done dynamically by each of the operations listed
below. Compared with Static-IL, Dynamic-IL is missing one operation, deep-const, and contains one
new operation, hold. Deep-const is unnecessary because there is no a priori determination of when a
constant will be needed; hold is now used to pass a value computed in one phase to the next phase.

5.2.2.1. (hold t expr)

Hold always returns an ERT. The argument t may be any value of type type -- it is not an expression
-- and expr is an expression that evaluates to a value of that type. Hold simply evaluates expression
expr to some value ev, and returns an ERT whose Expression component is the quoted value 'ev and
whose Type component is t. Thus, even though expr is evaluated during
this phase, its value is not used until the next phase.
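A minimal sketch of hold in Python follows. The ERT triple, its field names, and the evaluate argument are assumptions of this sketch; in particular, the empty Required-environment is an assumption based on the fact that a quoted constant refers to no identifiers.

```python
from collections import namedtuple

# Assumed encoding of an ERT as a triple; the field names follow the text's
# Expression / Required-environment / Type components.
ERT = namedtuple("ERT", ["expression", "req_env", "type"])


def hold(t, expr, evaluate):
    """Evaluate expr now and wrap its value as a quoted constant for the next phase."""
    ev = evaluate(expr)
    # Assumption: a quoted constant needs no bindings, so req_env is empty.
    return ERT(("quote", ev), (), t)


# Toy evaluator for this sketch only: expressions are zero-argument callables.
held = hold("type", lambda: ("fun", "number", "number"), evaluate=lambda e: e())
```

The expression is evaluated when hold runs, but the resulting value only becomes visible when the returned ERT is itself evaluated in the following phase.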


Hold is typically used to synchronize two values that are needed by an operation. Each of the check-
operations must detect this and generate holds as needed. For example, the funtype operation has two
subexpressions that must evaluate to types. During the phases before the subexpressions evaluate to
types, they will evaluate to ERTs. What if one of the subexpressions is ready to evaluate to a type
during some phase, but the other is still going to evaluate to an ERT, and will not evaluate to a type
until the following phase? In that case, the subexpression that is ready to evaluate to a type can be
evaluated, and the hold operation can be used to pass the resulting type value on to the next phase,
when the other subexpression will also evaluate to a type. Thus check-funtype can force both
subexpressions to return ERTs during one phase, and during the next phase, the funtype operation will
have both type values as needed. Section 5.2.2.2 explains specifically how this works for the
check-funtype operation. Other check- operations work analogously.

5.2.2.2. (check-funtype expr_d expr_r)

Check-funtype always returns an ERT: it is used to generate a funtype ERT. However, unlike in
Static-IL, check-funtype will not necessarily generate a funtype expression during this phase. If its
arguments will not be ready to be fully evaluated to types during the next phase (that is, if its
arguments are still going to evaluate to ERTs), another check-funtype expression is generated. This is
similar to the way check-apply in Static-IL generates an apply if the function arguments will be ready
in the next phase, and a check-apply if not.

Check-funtype is evaluated as follows. Both subexpressions expr_d and expr_r are evaluated to ERTs;
call them <e_d, r_d, t_d> and <e_r, r_r, t_r>. If both t_d and t_r are type, a funtype expression is
generated, as in Static-IL. If both t_d and t_r are ert, the returned ERT will contain a check-funtype
expression: the Expression component will be (check-funtype e_d e_r); the Required-environment
component will be the combination of r_d and r_r, which must be consistent (as defined in
Section 4.2.3.2); and the Type component will be ert.

Note that, for a funtype expression to be generated, both t_d and t_r must be type, indicating that e_d
and e_r will evaluate to types. Since every expression starts out (after translation) evaluating to an ERT,
in effect t_d and t_r will start out as ert and will become type during some later phase. But what if one
of the check-funtype's subexpressions is ready to evaluate to a type before the other is ready? That is,
what if either t_d or t_r is type, but the other is ert?

In this case, we can simply allow the type value to be computed during the next phase, but use a hold
expression to pass the result on to the next phase, to be ready during the phase when the other
subexpression is also ready to evaluate to a type. Thus, a check-funtype is generated as before, but a
hold is inserted to pass the type value to a subsequent phase as a constant. For example, if t_d is type
and t_r is ert, the resulting Expression component will be (check-funtype (hold type e_d) e_r).


It is a "compiletime" error if either t_d or t_r is not type or ert.
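The case analysis for check-funtype can be summarized in a short Python sketch. The ERT triple, the combine helper, and the tuple encoding of expressions are assumptions of this sketch, as is the choice of type as the Type component of a generated funtype ERT (the text does not state it explicitly).

```python
from collections import namedtuple

# Assumed encoding of an ERT as (Expression, Required-environment, Type).
ERT = namedtuple("ERT", ["expression", "req_env", "type"])


def combine(r_d, r_r):
    """Merge two required environments, rejecting inconsistent bindings."""
    merged = dict(r_d)
    for name, typ in r_r:
        if merged.setdefault(name, typ) != typ:
            raise TypeError(f"inconsistent requirement for {name!r}")
    return tuple(merged.items())


def check_funtype(ert_d, ert_r):
    """Build the ERT returned by check-funtype from its two evaluated arguments."""
    (e_d, r_d, t_d), (e_r, r_r, t_r) = ert_d, ert_r
    if {t_d, t_r} - {"type", "ert"}:
        raise TypeError("check-funtype arguments must have type 'type' or 'ert'")
    req_env = combine(r_d, r_r)
    if t_d == t_r == "type":
        # Both sides are ready: generate the conventional funtype operation.
        # Assumption: the resulting ERT evaluates to a type value.
        return ERT(("funtype", e_d, e_r), req_env, "type")
    # At least one side still yields an ERT next phase: hold the ready side.
    if t_d == "type":
        e_d = ("hold", "type", e_d)
    if t_r == "type":
        e_r = ("hold", "type", e_r)
    return ERT(("check-funtype", e_d, e_r), req_env, "ert")


ready = ERT(("quote", "number"), (), "type")
pending = ERT("t", (("t", "ert"),), "ert")
both_ready = check_funtype(ready, ready)
one_ready = check_funtype(ready, pending)
```

In one_ready, the domain expression is wrapped in hold so that its type value is computed in the next phase and carried forward as a constant until the range expression is ready as well.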

5.2.2.3. (check-lambda id expr_d expr_r expr_body)

Analogous to check-funtype, check-lambda is used to generate a lambda but should generate another
check-lambda if the body expression will not be ready during the next phase (that is, if the body
expression evaluates to an ERT whose Type component is ert). If another check-lambda is generated,
the type expressions will simply be the quoted type values that were computed during this phase.

The implementation of check-lambda is not as straightforward as it may at first seem; its discussion is
postponed to Section 5.3.

5.2.2.4. (check-check-lambda id expr_d expr_r expr_body)

This construct is handled straightforwardly in a manner analogous to check-funtype above. After
evaluating expr_d and expr_r, an ERT binding (with Type component ert) is created for the formal
parameter, and the body expression expr_body is executed in an environment augmented by this
binding. If both expr_d and expr_r have evaluated to ERTs whose Type component is type, a
check-lambda will be generated. Otherwise another check-check-lambda should be generated, with
hold used as necessary.

5.2.2.5. (check-apply expr_f expr_a)

This construct is evaluated as in Static-IL, except that hold may be inserted as needed if either the
function or the argument is ready before the other (that is, if one will still be an ERT when the other
will be a function or non-ERT value during the next phase).

5.3. Problems in Implementing Check-Lambda

Before discussing these issues, it should first be noted that there are several ways of partially
evaluating function calls. Beckman et al. [Beckman 76] provide a good outline of the various methods.
We will restrict our attention to the simplest choice.

Suppose we have the following lambda abstraction in Dynamic-Phi, which has free variable f:


λ x : number → number . (f x)


And consider the corresponding Dynamic-IL program:

(check-lambda x 'number 'number (check-apply f x))


where f is type ert.

Now, our goal here is to come up with a method of implementing the check-lambda operation. As a
general outline, it should proceed according to the following steps:

1. Evaluate the two type subexpressions. In this example, they are 'number and 'number, and
they simply evaluate to the type values number and number.

2. Decide on a suitable ERT binding for the formal parameter, and add this binding to the
environment. In our example, we must bind x to the proper ERT, and add this binding to
the environment. (An ERT binding for f will already be in the environment.)

3. Evaluate the body expression to an ERT in this new environment. In our example, we
must evaluate the check-apply to an ERT.

4. Using the ERT that resulted from evaluating the body expression, construct and return
either a lambda ERT, if the body is ready to be evaluated to some basic non-ERT value in
the next phase; or a check-lambda, if the body must evaluate to another ERT in the next
phase.

Thus, in our example, the result of step 3 will either be an apply ERT (if the body is ready to
evaluate to a number), such as the following (call this ertA):

(  ; ertA
  (apply f x)  ; Expression
  ((f (fun number number)))  ; Required-environment
  number  ; Type
)

or a check-apply ERT (if the body must still evaluate to another ERT), such as the following (call this
ertB):

(  ; ertB
  (check-apply f x)  ; Expression
  ((f ert))  ; Required-environment
  ert  ; Type
)

The Type component of ertA indicates that it is ready to evaluate to a number, whereas the Type
component of ertB indicates that it will evaluate to another ERT.


Finally, the result of evaluating the check-lambda (i.e. the result of step 4) should either be a lambda
ERT, such as the following (call this ert1):

(  ; ert1
  (lambda x (apply f x))  ; Expression
  ((f (fun number number)))  ; Required-environment
  (fun number number)  ; Type
)

or it should be a check-lambda ERT, such as the following (call this ert2):

(  ; ert2
  (check-lambda x (check-apply f x))  ; Expression
  ((f ert))  ; Required-environment
  ert  ; Type
)

That is, the result of step 4 should be ert1 if the result of step 3 is ertA, whereas it should be ert2 if the
result of step 3 is ertB. That much is straightforward. The difficulty is this: What ERT binding should
we provide for x in step 2?

If the result of step 3 will be ertA, then we should supply a binding of x to the ERT
<x, <x,number>, number> in step 2, since the function can be applied to a number in the next phase,
and hence x will be bound to a number in the next phase. This binding, in effect, declares x to be type
number in the next phase.

On the other hand, if the result of step 3 will be ertB, then we should supply a binding of x to the
ERT <x, <x,ert>, ert> in step 2, since another check-apply will be evaluated in the next phase,
and hence x must be bound to another ERT in the next phase. This binding, in effect, declares x to be
type ert in the next phase.


Here is the dilemma. Since the result of evaluating the body in step 3 will in general depend on
factors other than just the binding of x (in this case it also depends on the binding of f from outside),
we cannot generally know which binding for x to use in step 2 until we know the result of step 3.

In our example, two potential values for f that would cause different results would be the ERT:

(
  f  ; Expression
  ((f (fun number number)))  ; Required-environment
  (fun number number)  ; Type
)

and the ERT:

(
  f  ; Expression
  ((f ert))  ; Required-environment
  ert  ; Type
)

5.3.1.  Evaluating  the  Body  Twice 

One way to deal with this dilemma might be to first assume that the check-lambda's body expression
will evaluate to an ERT that will be ready to be "fully" evaluated during the next phase; that is, first
assume that step 3 will evaluate to an ERT such as ertA, in which the Type component is not ert.
Thus, we would initially bind x to the ERT <x,<x,number>,number>. If the Type component of the
result of step 3 turns out to be a basic (non-ERT) type, such as in ertA, then all is well, and the result of
check-lambda should be a lambda ERT, such as ert1. However, if the Type component turns out to be
ert, such as in ertB, then we bind x to the ERT <x,<x,ert>,ert> and re-evaluate the body as in step 3
again.
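The two-attempt strategy can be sketched in Python as follows. This is an illustration under stated assumptions rather than a worked-out implementation: the ERT triple, the eval_body callback standing in for step 3, the body_for stand-in, and the removal of the formal parameter from the Required-environment are all assumptions of the sketch. The generated check-lambda follows the form in which ert2 is printed above, without type expressions.

```python
from collections import namedtuple

# Assumed encoding of an ERT as (Expression, Required-environment, Type).
ERT = namedtuple("ERT", ["expression", "req_env", "type"])


def check_lambda(ident, t_d, t_r, eval_body):
    """Two-attempt strategy; eval_body(binding) stands in for step 3."""
    # First attempt: assume the body is ready, so bind the parameter at type t_d.
    optimistic = ERT(ident, ((ident, t_d),), t_d)
    body = eval_body(optimistic)
    if body.type != "ert":
        req = tuple(b for b in body.req_env if b[0] != ident)
        return ERT(("lambda", ident, body.expression), req, ("fun", t_d, t_r))
    # Second attempt: the body is not ready, so rebind the parameter as an ERT
    # and evaluate the body again.
    pessimistic = ERT(ident, ((ident, "ert"),), "ert")
    body = eval_body(pessimistic)
    req = tuple(b for b in body.req_env if b[0] != ident)
    return ERT(("check-lambda", ident, body.expression), req, "ert")


def body_for(f_type):
    """Hypothetical stand-in for evaluating (check-apply f x) given f's type."""
    def eval_body(x_binding):
        if f_type == "ert":
            req = (("f", "ert"),) + x_binding.req_env
            return ERT(("check-apply", "f", "x"), req, "ert")
        req = (("f", f_type),) + x_binding.req_env
        return ERT(("apply", "f", "x"), req, f_type[2])
    return eval_body


ert1 = check_lambda("x", "number", "number", body_for(("fun", "number", "number")))
ert2 = check_lambda("x", "number", "number", body_for("ert"))
```

With f bound at a function type the first attempt succeeds and the result corresponds to ert1; with f bound at type ert the body is evaluated a second time and the result corresponds to ert2, which shows where the extra cost of this strategy arises.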

5.3.2.  Evaluating  with  Both  Choices  at  Once 

A more efficient solution might be to have the execution of expr_body generate the two ERTs, under
both assumptions. But what happens when functions are nested? Must 4, 8, etc., ERTs be generated?
When can the choices be eliminated?


5.3.3. Using an Extra Environment Variable

Since the difficulty seems to be in deciding which binding to use for the formal parameter, another
possibility might be for the Dynamic-IL Machine to use an extra environment variable while a
check-lambda body is being executed, indicating the names of any identifiers that are bound as formal
parameters. Those identifiers could then be treated specially by the Dynamic-IL Machine. This might
allow the body to be evaluated in one pass. But notice, then, that the machine would essentially be
playing the role of two distinct machines depending on whether this extra environment variable were
set.
5.3.4. The Root of the Problem


The root of the problem is that we are asking the Dynamic-IL Machine to do two kinds of things
during the same phase: to partially and fully evaluate certain subexpressions, and to compile certain
subexpressions for future partial or full evaluation. Section 5.4 proposes another solution.

5.4.  A  Strongly  Typed  Phase  Compiler 

A better approach to the problems of implementing check-lambda in Dynamic-IL might be to
evaluate programs in two distinct passes: call the first pass phase compilation and the second pass phase
evaluation. As before, phase evaluation would perform both partial and full evaluation, filling the roles
of traditional compiletime and runtime.

During phase compilation every expression would be treated symbolically: none of the
subexpressions would evaluate to a final (constant) value and no type checking would be done. The
purpose of phase compilation would be to decide which subexpressions can be fully evaluated and
which should be partially evaluated. The resulting program will have these decisions syntactically built
into it (as with lambda, check-lambda, etc.), ready for phase evaluation. To make these decisions, the
phase compiler must know which of the program's free variables are to be given final (constant) values
during phase evaluation.

Phase evaluation could then fully evaluate some subexpressions and partially evaluate others. The
result of the phase evaluation would either be some final constant or another program (ERT), but its
type would be known beforehand. The key to this approach is that the type of every expression is
known before phase evaluation. Thus, those expressions being fully evaluated can be evaluated just as
efficiently as on a conventional (i.e. fully evaluating) abstract machine, even though the phase
evaluation machine is also performing partial evaluation for a strongly typed language.
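One possible reading of the phase compilation pass is sketched below in Python. Since the approach is not worked out in detail here, everything in the sketch is an assumption: the tuple encoding of expressions, the names ready and phase_compile, and the rule that an operation keeps its conventional form only when all of its subexpressions depend solely on known identifiers.

```python
def ready(expr, known):
    """True if expr can be fully evaluated given the set of known identifiers."""
    if isinstance(expr, str):
        return expr in known
    op = expr[0]
    if op == "quote":
        return True
    if op == "lambda":
        _, ident, body = expr
        # Assumption: inside the body the formal parameter counts as known.
        return ready(body, known | {ident})
    return all(ready(sub, known) for sub in expr[1:])


def phase_compile(expr, known):
    """Rewrite operations to their check- forms wherever a subexpression is not ready."""
    if isinstance(expr, str) or expr[0] == "quote":
        return expr
    op = expr[0]
    if op == "lambda":
        _, ident, body = expr
        new_body = phase_compile(body, known | {ident})
        tag = "lambda" if ready(body, known | {ident}) else "check-lambda"
        return (tag, ident, new_body)
    subs = tuple(phase_compile(sub, known) for sub in expr[1:])
    tag = op if ready(expr, known) else "check-" + op
    return (tag,) + subs


program = ("lambda", "x", ("apply", "f", "x"))
compiled_known = phase_compile(program, {"f"})
compiled_unknown = phase_compile(program, set())
```

When f is among the free variables given final values, the program is left in its fully evaluable form; otherwise both the abstraction and the application are marked for partial evaluation, matching the Expression components of ert1 and ert2 in Section 5.3.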


Chapter  6 

Remarks  About  Other  Language  Notions 


This  section  discusses  some  miscellaneous  language  notions,  and  shows  how  phases  are  relevant  to 
them  or  vice-versa. 

6.1.  Abstract  Data  Types 

This  section  discusses  one  possible  approach  to  providing  Abstract  Data  Types  (ADTs)  under  a 
model  of  phases.  The  purpose  of  this  section  is  to  demonstrate  some  of  the  usefulness  of  manipulating 
ERTs  under  a  model  of  phases.  The  ERT  data  type  makes  it  easy  to  talk  clearly  and  sensibly  about 
compiletime  notions  such  as  enforcing  the  information  hiding  needed  to  implement  ADTs. 

Our notion of ADTs is intended to be ordinary -- corresponding basically to Ada packages, for
example -- but our view of implementing ADTs is somewhat unusual, and is motivated by the fact that
we treat types and code (ERTs) as first-class values. That is, we intend to provide the same basic
functionality of traditional ADTs, but we take an unusual view of what is required and how to provide
it. In effect, this discussion treats ADTs from a compiler's point of view, since phases fill the role of
traditional compile-time.

We consider a newly defined ADT to be essentially a unique type and a set of operations that are
privileged to operate on values of that type. As in Ada, we assume that an ADT has no separate
functional or behavioral specification: its intended behavior is defined only by its implementation.
The programmer defines a new ADT in terms of other types and operations, thus supplying an
implementation for it. The implementation should be hidden from the user: this information hiding
should be enforced by the language.


ADTs  in  terms  of  what  a  compiler  must  do  to  enforce  the  required  information  hiding.  To  focus  only 
on  the  essential  elements,  we  do  not  address  scoping  rules  or  other  extraneous  issues  such  as  providing 
separate  declarations  of  ADT  headers  and  bodies,  as  is  allowed  in  Ada. 


6.1.1.  Four  Essential  Functions 

Let us personify the portion of the program that implements the ADT as the implementor, and the
portion of the program that uses the ADT as the user. To employ the canonical example, we might
define an ADT called stack, offering only push and pop functions for accessing values of type stack,
and use an array to implement the stack.27 The compiler, then, must ensure that the implementation
of type stack as an array is hidden from the user, but is available to the stack implementor.

The stack implementor, then, must be privileged to perform two essential acts: to create a value of
the ADT from a value of the implementing type, for example, creating a stack value from an array
value; and to view a value of the ADT as a value of the implementing type, for example, viewing a
stack value as an array. Less obviously, though, in a language in which types are first-class values, the
implementor of the stack ADT must also be privileged to perform two additional essential acts: to
create the type value stack from the implementing type value array; and to view the type value stack as
the type value array.

All four of these privileged acts are compiletime sleights of hand28 -- they are functions involving
types that are computed at compiletime. Recall that during "compiletime" (a relative term) type and
ert values are manipulated, and that in our model a phase fills the role of traditional compiletime.
Thus, to create a stack ADT, we need the following four functions.

abs-stack: ert -> ert

For creating values of type stack from values of type array. Note that this function
takes an ERT value and returns an ERT value -- it does not take an array value and
return a stack value. Rather, it takes an expression (an ERT) that will evaluate to an
array value, and returns an expression (an ERT) that will evaluate to a stack value.
Only the Type component of the argument ERT and the result ERT will differ.


27 Usually a stack would be implemented by a pair consisting of an array and an integer, with the integer representing a
pointer to the current stack top. This detail is irrelevant here and we are ignoring it for the sake of simplicity.

28 Pascal's ord function is probably the best known example of a "compiletime sleight of hand". For any scalar type, the ord
function returns an integer representing the argument's ordinal value. Ord is almost universally implemented in the compiler
simply by viewing the binary representation of its argument as a value of a different type. For example, in a Pascal system in
which the ASCII character set is used, the value of ord('X') would be 88. That is, the binary value 1011000 is simply
interpreted as representing an integer (88) instead of representing the ASCII character X. Because the binary representations
of these values are the same, the compiler does not emit any code to implement the ord function. Although the compiler would
view the expression 'X' as producing a value of type char and the expression ord('X') as producing a value of type integer,
the code generated for these two expressions would be absolutely identical.


imp-stack: ert -> ert

For  viewing  values  of  type  stack  as  values  of  type  array.  Again,  this  function  takes 
an  expression  (an  ERT)  that  will  evaluate  to  a  stack  and  returns  an  expression  that 
will  evaluate  to  an  array. 


type-abs-stack: type -> type

For  creating  a  stack  type  value  from  an  array  type  value.  Note  that  this  is  a  function 
that  takes  a  type  and  returns  a  type.  The  type  value  that  is  supplied  as  the  actual 
parameter  must  be  an  array  type  value:  a  stack  type  value  will  be  returned. 


type-imp-stack: type -> type

For viewing stack type values as array type values. Note that this is a function that
takes a type and returns a type. The type value that is supplied as the actual
parameter must be a stack type value: an array type value will be returned.


6.1.2. Type Values: <tag,value> Pairs


Let us now assume that type values are represented as <tag,value> pairs.


The tag component is a symbol identifying the type: for example, it might be the symbol array
representing any array type. It is the same for every array type.


The value component holds other information about that particular type. For example, it might
include information about the array's element type and size (if the size is a part of the type, as it is in
Pascal, for example). The value component will generally be different for different array types. For
programmer-defined ADTs it will always be a type.
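
As an informal illustration only, the pair representation can be sketched in Python, which is not the language of this work. The names TypeValue and array_type, and the choice to store the element type and size in the value component, are assumptions made for the sketch.

```python
from typing import Any, NamedTuple


class TypeValue(NamedTuple):
    """A type value as a <tag, value> pair (hypothetical encoding)."""
    tag: str    # symbol identifying the kind of type, e.g. "array"
    value: Any  # remaining information about this particular type


def array_type(element: TypeValue, size: int) -> TypeValue:
    """Build an array type whose value part records element type and size."""
    return TypeValue("array", (element, size))


NUMBER = TypeValue("number", None)
BOOLEAN = TypeValue("boolean", None)

ten_numbers = array_type(NUMBER, 10)
five_booleans = array_type(BOOLEAN, 5)

# Every array type carries the same tag, but the value parts differ.
same_tag = ten_numbers.tag == five_booleans.tag
same_value = ten_numbers.value == five_booleans.value
```

Here same_tag is true and same_value is false, matching the description of the two components.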


6.1.3. Type-of and Tag-of


To hide the representations of types and erts while still allowing a program to determine the Type
component of an ERT or the tag component of a type, functions type-of and tag-of can be provided:

type-of: ert -> type


tag-of: type -> symbol


If the representations of ERTs and types were visible, these functions would be defined simply as
selectors of the corresponding components.

6.1.4. Implementations of the Four Essential Functions

The four essential functions for a stack ADT can be roughly defined as follows. Bear in mind that
types type and ert should also be ADTs, and the programmer would not have indiscriminate access to
their representations. However, for expository purposes, the functions defined below show the
representations of types as pairs and erts as triplets. (Error, below, represents a "compiletime" error
condition indicating that the programmer tried to use one of these stack functions to convert between
something other than a stack and an array.)

(abs-stack <e,r,<t,v>>) =
    if t = 'array
    then <e,r,<'stack,<t,v>>>    -- Save the implementing type.
    else error

(imp-stack <e,r,<t,v>>) =
    if t = 'stack
    then <e,r,v>                 -- Restore the implementing type.
    else error

(type-abs-stack <t,v>) =
    if t = 'array
    then <'stack,<t,v>>          -- Save the implementing type.
    else error

(type-imp-stack <t,v>) =
    if t = 'stack
    then v                       -- Restore the implementing type.
    else error
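
For readers who prefer something executable, the four functions can be approximated in Python. This is a sketch under stated assumptions, not part of Phi or IL: TypeValue, ERT and PhaseError are hypothetical names, and defining the two ERT functions in terms of the two type functions is a choice made here for brevity.

```python
from typing import Any, NamedTuple


class TypeValue(NamedTuple):
    """A type value as a <tag, value> pair."""
    tag: str
    value: Any


class ERT(NamedTuple):
    """An <e, r, t> triplet: expression, required environment, type."""
    expr: Any
    renv: tuple
    typ: TypeValue


class PhaseError(Exception):
    """Stands in for the "compiletime" error condition in the text."""


def type_abs_stack(t: TypeValue) -> TypeValue:
    if t.tag != "array":
        raise PhaseError("expected an array type")
    return TypeValue("stack", t)   # save the implementing type


def type_imp_stack(t: TypeValue) -> TypeValue:
    if t.tag != "stack":
        raise PhaseError("expected a stack type")
    return t.value                 # restore the implementing type


def abs_stack(ert: ERT) -> ERT:
    # Only the type component changes; expression and environment are kept.
    return ert._replace(typ=type_abs_stack(ert.typ))


def imp_stack(ert: ERT) -> ERT:
    return ert._replace(typ=type_imp_stack(ert.typ))


array_of_number = TypeValue("array", ("number", 10))
array_expr = ERT("a", (("a", array_of_number),), array_of_number)
stack_expr = abs_stack(array_expr)
```

Applying imp_stack to stack_expr returns the original triplet, while applying imp_stack to array_expr raises PhaseError, which mirrors the error branch of the definitions.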


The relationships between these four functions are illustrated in Figure 6-1.

6.1.5.  Defining  a  New  Abstract  Data  Type 

To allow a new abstract data type to be defined, the language needs to supply a function that will
create and return the four functions described above, with a new uniquely generated tag embedded in
them. The new tag should also be returned, to allow the programmer to test for this new type without
incurring an error.
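
A generator of this kind might look roughly as follows in Python. This is only a sketch: the name new_adt, its two parameters, the counter used to make tags unique, and the NewADT record are all assumptions, and ERTs are modelled as plain (e, r, t) tuples.

```python
import itertools
from typing import Any, Callable, NamedTuple


class TypeValue(NamedTuple):
    tag: str
    value: Any


class PhaseError(Exception):
    """Stands in for the "compiletime" error condition."""


class NewADT(NamedTuple):
    tag: str
    type_abs: Callable[[TypeValue], TypeValue]
    type_imp: Callable[[TypeValue], TypeValue]
    abs_value: Callable[[tuple], tuple]
    imp_value: Callable[[tuple], tuple]


_serial = itertools.count(1)


def new_adt(name: str, impl_tag: str) -> NewADT:
    """Create the four functions for a fresh ADT, plus its unique tag."""
    tag = f"{name}-{next(_serial)}"   # uniquely generated tag

    def type_abs(t: TypeValue) -> TypeValue:
        if t.tag != impl_tag:
            raise PhaseError(f"expected a type tagged {impl_tag}")
        return TypeValue(tag, t)

    def type_imp(t: TypeValue) -> TypeValue:
        if t.tag != tag:
            raise PhaseError(f"expected a type tagged {tag}")
        return t.value

    def abs_value(ert: tuple) -> tuple:
        e, r, t = ert
        return (e, r, type_abs(t))

    def imp_value(ert: tuple) -> tuple:
        e, r, t = ert
        return (e, r, type_imp(t))

    return NewADT(tag, type_abs, type_imp, abs_value, imp_value)


stack_a = new_adt("stack", "array")
stack_b = new_adt("stack", "array")
array_type = TypeValue("array", ("number", 10))
as_stack_a = stack_a.abs_value(("a", (), array_type))
```

Because each call embeds a different tag, stack_b.imp_value(as_stack_a) raises PhaseError even though both ADTs are implemented by arrays.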
Figure 6-1: Implementing Abstract Data Types

6.2. Dependent Types

Dependent types, such as those of Pebble [Burstall 84], are compound types in which the type of one
element depends on the value of another element. For example, in Pebble a dependent type is used to
express the type of a polymorphic pair-swapping function:

(t1:type × t2:type) ->> (t1×t2 -> t2×t1)

The symbol ->> is similar to ->, except that bound variables appear on the left and may be used
on the right to refer to their values. This polymorphic swapping function is actually a function that
returns a function: it is first given the types of the elements to swap, and the result is then a swapping
function, specific to those types, that may be applied to an actual pair of elements. It swaps and returns
the first and second elements of the pair. Thus, for example, if we wish to swap [int, bool] pairs,
returning [bool, int] pairs, instantiating swap for these types would yield a function value of type
int×bool -> bool×int:

swap[int,bool]: int×bool -> bool×int
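
The two-stage shape of swap (types first, then the pair) can be imitated in Python. This is a loose analogy rather than Pebble: Python checks the element types only when the pair is supplied, whereas Pebble assigns the dependent type statically. The inner name swap_pair is hypothetical.

```python
from typing import Any, Callable, Tuple

Pair = Tuple[Any, Any]


def swap(t1: type, t2: type) -> Callable[[Pair], Pair]:
    """Given two types, return a swapping function specific to them."""

    def swap_pair(pair: Pair) -> Pair:
        first, second = pair
        if not (isinstance(first, t1) and isinstance(second, t2)):
            raise TypeError(f"expected a ({t1.__name__}, {t2.__name__}) pair")
        return (second, first)

    return swap_pair


swap_int_bool = swap(int, bool)      # plays the role of swap[int,bool]
result = swap_int_bool((3, True))    # the pair with its elements exchanged
```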

Dependent  types  seem  to  have  arisen  mainly  from  the  desire  to  assign  sensible  types  to  all 
expressions,  yet  also  be  able  to  parameterize  something  by  a  type,  such  as  a  polymorphic  function; 
manipulate type-tagged values at runtime; and define recursive types.




Phi  does  not  offer  dependent  types,  but  some  of  the  same  functionality  could  be  obtained  in  other 
ways,  as  described  below. 

In Static-Phi, arbitrary type expressions can be evaluated at compiletime, and polymorphic functions
or data structures can be instantiated to particular types. In Pebble [Burstall 84], one is unable to talk
about a function's actual parameter without dealing with the parameter's runtime value. But in
Static-Phi, a function's actual parameter is an ERT value during the phase before the function is applied, so
the Type component can meaningfully be extracted and used at that time. This also means that a
polymorphic function need not have an extra explicit type parameter.

Dependent types also allow type-tagged values to be manipulated at runtime, and this may be a
desirable capability to provide. This can be accomplished by providing a type any -- a variable of type
any could hold a value of any other type, tagged with the value's type. A case conformity clause can be
used to query the variable's current type and access its value while retaining strong typing. (This is
essentially the way Algol 68 [Lindsey 71] provides union types.) Note that the current work of Gifford,
Schooler, et al. [Schooler 84] takes a very attractive approach to this: where possible, they do type
checking before runtime; if a type cannot be determined before runtime, dynamic type checking is
used.
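
The idea of a type any queried through a conformity clause can be sketched in Python. The names AnyValue and conformity_case are hypothetical, and the sketch shows only the runtime dispatch; it does not reproduce the static guarantees that Algol 68 attaches to each branch.

```python
from typing import Any, Callable, Dict, NamedTuple


class AnyValue(NamedTuple):
    """A value of type any: a payload tagged with its type."""
    tag: str
    payload: Any


def conformity_case(value: AnyValue,
                    branches: Dict[str, Callable[[Any], Any]],
                    otherwise: Callable[[AnyValue], Any]) -> Any:
    """Run the branch whose tag matches the value's current type."""
    handler = branches.get(value.tag)
    if handler is None:
        return otherwise(value)
    # Inside the branch, the payload is known to have the branch's type.
    return handler(value.payload)


def describe(v: AnyValue) -> str:
    return conformity_case(
        v,
        {
            "number": lambda n: f"number {n + 1}",
            "boolean": lambda b: "yes" if b else "no",
        },
        otherwise=lambda _: "unknown",
    )
```

For example, describe(AnyValue("number", 41)) takes the number branch, where arithmetic on the payload is safe.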

Recursive  types  are  discussed  in  Section  7.2.7. 

6.3.  Type  Checking  Recursive  Functions 

The  question  usually  arises:  "Is  there  any  special  difficulty  in  type  checking  recursive  functions?" 
Not  when  the  function’s  parameter  and  return  types  are  declared.  In  fact,  the  type  checking  is  very 
similar  to  the  non-recursive  case,  even  when  types  are  allowed  as  first-class  values. 

The idea of a recursive function is that the function’s name can be used inside the function body.
The  only  impact  this  has  on  type  checking  is  that  the  function's  type  must  be  known  inside  the  body. 
This  is  easy  to  arrange,  because  the  function's  type  is  known  from  parameter  and  return  type 
declarations. 

Compare the type checking required for a non-recursive let construct versus a recursive letrec
construct. The two constructs would be:

let id: expr_type = expr_value in expr_body

letrec id: expr_type = expr_value in expr_body

In each case, expr_type gives the type of the bound variable id. Call this type t. The only difference in
type checking the two constructs is that in type checking expr_value for letrec, id is known to be type t,
rather than whatever type it may have been declared to be in the surrounding scope. This is true even
if the function happens to construct a type value.29
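
The single difference between the two constructs can be made concrete with a toy checker in Python. This is a sketch under simplifying assumptions: expressions are nested tuples, declared types are given directly as values rather than as expressions to be evaluated, and the constructor names ("lam", "app", and so on) are invented for the example.

```python
def check(expr, env):
    """Return the type of expr under env, or raise TypeError."""
    kind = expr[0]
    if kind == "num":
        return "number"
    if kind == "bool":
        return "boolean"
    if kind == "var":
        if expr[1] not in env:
            raise TypeError(f"unbound identifier {expr[1]}")
        return env[expr[1]]
    if kind == "lam":
        _, param, param_type, return_type, body = expr
        if check(body, {**env, param: param_type}) != return_type:
            raise TypeError("body and range types differ")
        return ("fun", param_type, return_type)
    if kind == "app":
        fun_type = check(expr[1], env)
        if not (isinstance(fun_type, tuple) and fun_type[0] == "fun"):
            raise TypeError("non-function applied")
        if check(expr[2], env) != fun_type[1]:
            raise TypeError("argument type mismatch")
        return fun_type[2]
    if kind in ("let", "letrec"):
        _, name, declared, value, body = expr
        inner = {**env, name: declared}
        # The only difference: letrec checks the value with id already bound.
        value_env = inner if kind == "letrec" else env
        if check(value, value_env) != declared:
            raise TypeError("declared and actual types differ")
        return check(body, inner)
    raise ValueError(f"unknown construct {kind}")


fun_type = ("fun", "number", "number")
loop = ("lam", "n", "number", "number", ("app", ("var", "f"), ("var", "n")))
call = ("app", ("var", "f"), ("num", 0))
letrec_type = check(("letrec", "f", fun_type, loop, call), {})
```

The same program written with let is rejected, because f is not yet known while the value expression is being checked.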


29 As mentioned in Section 3.3.2, the type that is constructed can be used as an invariant of the next phase (i.e., it can be used in
a declaration pertaining to the next phase), but it cannot be used as an invariant of the phase during which it is computed.


Chapter  7 

Conclusions  and  Future  Work 


7.1.  Conclusions 


This  work  has  addressed  the  basic  question  of  whether  types  and  code  can  be  manipulated  as 
first-class  values  while  retaining  strong  typing.  We  demonstrated  how  this  can  be  done  by  introducing 
the  notion  of  multiple  strongly  typed  evaluation  phases.  In  the  simplest  case,  two  phases  correspond  to 
the  traditional  notions  of  compiletime  and  runtime,  though  a  single  machine  is  used  for  both.  In 
general,  multiple  phases  may  be  used,  and  each  phase  acts  as  compiletime  relative  to  the  next  phase,  or 
runtime  relative  to  the  previous  phase.  Types  that  are  freely  manipulated  as  first-class  values  during 
one  phase  become  invariants  of  the  next  phase,  thus  guaranteeing  that  the  next  phase  is  strongly  typed. 


One  benefit  of  allowing  types  and  code  to  be  manipulated  as  first-class  values  under  the  model  of 
phases  is  that  the  same  abstract  machine  can  be  used  to  both  compile  and  run  the  program.  This 
means  that  all  of  the  features  that  are  available  in  the  language  at  runtime  are  also  available  at 
compiletime.  The  features  only  need  to  be  implemented  once  in  the  single  machine,  and  they  are  thus 
guaranteed  to  have  the  same  semantics  at  compiletime  and  runtime.  Thus,  for  example,  constant 
expressions can be evaluated at compiletime using the same efficient evaluation mechanism as is used
at runtime, whereas, in general, a conventional compiler must simulate the action of the runtime
machine in evaluating constant expressions. The single machine is therefore inherently "efficient" in
two respects: (1) for runtime operations, it can have the same efficiency as a conventional machine in
evaluating untyped lambda calculus expressions, even though it has the additional capability of
performing compiletime operations; and (2) for compiletime operations it can be much more efficient
than  a  conventional  compiler,  because  compiletime  tasks  that  can  already  be  performed  at  runtime, 
such  as  evaluating  constant  expressions,  are  executed  directly  rather  than  being  simulated. 


The special abstract data type ERT is essential to constructing and manipulating code fragments as
first-class values, while capturing all information necessary to ensure that any code generated in this
manner will be strongly typed. The ERT data type makes it possible to use the same abstract machine
to do the compiletime operations of type checking and code generation, as well as conventional
runtime operations. The ERT data type also makes it easy to talk sensibly about compiletime notions
such  as  asking  for  the  type  of  an  expression,  or  dealing  with  the  type  conversions  involved  in 
implementing  abstract  data  types. 


The  notion  of  phases,  with  its  ERT  data  type  and  uniform  treatment  of  compiletime  and  runtime, 
gives  insight  into  the  semantic  processing  that  occurs  during  compiletime  and  runtime.  It  also  gives 
insight  into  how  to  efficiently  implement  compiletime  notions  such  as  type  checking,  using  runtime 
machinery,  and  how  to  efficiently  provide  runtime  notions  at  compiletime. 

We  believe  that  the  notions  of  strong  typing,  types  as  first-class  values,  and  partial  or  phase 
evaluation  complement  each  other  handsomely  in  providing  a  language  basis  for  writing  more 
reusable,  correct,  and  efficient  software:  "reusable"  because  types  can  be  manipulated  as  first-class 
values,  and  because  of  the  ability  to  construct  new  strongly-typed  programs  with  phases  or  specialize 
programs  with  partial  evaluation;  "correct"  because  of  strong  typing;  and  "efficient"  because  of  the 
ability to perform much of the computation before runtime.

The  following  sections  outline  some  suggested  future  work. 

7.2.  Subjects  for  Further  Study 

7.2.1.  Developing  a  Practical  Language  Based  on  Static-Phi  and  Static-IL 

The Static-Phi and Static-IL languages of Chapter 4, which embody our particular model of phases, are
based on typed and untyped versions of the lambda calculus, and were presented as purely pedagogical
languages. It would be reasonably straightforward to expand these into useful real-life functional
languages with a full complement of data types and operators.

7.2.2.  Using  Phases  for  Partial  Evaluation 

This  was  discussed  in  Chapter  5. 


7.2.3.  Constructing  and  Maintaining  Environments 


More  work  is  needed  on  how  to  effectively  generate  and  manipulate  the  environment  required  for 
each  phase.  This  comes  in  the  larger  context  of  programming  methodology. 


7.2.4.  Determining  the  Source  of  a  Bug 


Suppose a bug is discovered. Where did it originate? During what phase? To some extent,
determining the origin of a bug becomes inherently more difficult with more reusable
software, in the following sense. When a program is constructed from several pieces of different
origins, it may be more difficult to know which piece of the program is at fault when a bug is
discovered. On the other hand, if a standard set of reusable software components is provided, they
can be very thoroughly debugged. Overall, we do not know whether multiple phases will make
debugging significantly more or less difficult.


7.2.5.  Universal  Polymorphism 


A  function  is  polymorphic  if  different  parameter  types  may  be  used  in  different  invocations.  Burstall 
and  Lampson  [Burstall  84]  distinguish  between  two  kinds  of  polymorphism  (attributing  the  distinction 
to  C.  Strachey  [Strachey  67]): 


Ad  hoc  [or  Generic]  polymorphism 

The code executed depends on the type of the argument, e.g., 'print 3'
involves different code from 'print "nonsense"'.


Universal  [or  Parametric]  polymorphism 

The  same  code  is  executed  regardless  of  the  type  of  the  argument 
since  the  different  types  of  data  have  uniform  representation,  e.g. 
reverse(1, 2, 3, 4) and reverse(true, false, false).


Ad  hoc  polymorphism  is  the  natural  form  of  polymorphism  under  phases.  Universal  polymorphism 
seems  to  require  something  additional.  The  basic  difficulty  is  that,  in  type  checking  the  call  of  a 
universally polymorphic function, such as reverse (above), different result types should be returned for
calls  using  different  actual  parameter  types,  even  though  the  same  function  will  be  called  at  runtime. 
Furthermore,  a  mechanism  for  type  checking  the  function  body  once,  independent  of  call  types,  should 
be  provided. 
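
The distinction between the two kinds of polymorphism can be illustrated in Python, outside the setting of phases. The function names show and reverse are chosen to echo the quoted examples; using functools.singledispatch for the ad hoc case is an assumption of this sketch, not a mechanism proposed in this work.

```python
from functools import singledispatch


@singledispatch
def show(value) -> str:
    """Ad hoc polymorphism: the code executed depends on the argument type."""
    raise TypeError(f"no printing code for {type(value).__name__}")


@show.register
def _(value: int) -> str:
    return f"integer {value}"


@show.register
def _(value: str) -> str:
    return f"string {value!r}"


def reverse(items: tuple) -> tuple:
    """Universal polymorphism: one body serves every element type."""
    return items[::-1]


numbers_reversed = reverse((1, 2, 3, 4))
booleans_reversed = reverse((True, False, False))
```

Note that reverse runs the same code for both calls, yet a type checker would be expected to report a different result type for each, which is the difficulty described above.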


ML [Gordon 79] uses unification in type checking polymorphic functions. Unification involves
having the language processor perform substantial computations involving types. This approach might
be used here, although it would seem to be somewhat contrary to the underlying philosophy of having
types of expressions simply computed rather than inferred by a more complex language processor. It
would be most attractive to use an approach that takes advantage of a language's existing ability to
explicitly manipulate types as first-class values, as the Static-Phi language does, rather than adding type
inference machinery to the language processor. We do not know how best to do this.

7.2.6.  Inferring  Types 

As  explained  in  Section  1.3.1,  this  work  was  motivated  by  a  bias  toward  expressing  rather  than 
inferring. However, much notable work on data types, such as ML [Gordon 79], has involved type
inference. It would be good to explore the relationship between type inference systems and our model
of  multiple  phases,  which  is  based  on  types  being  computed  directly.  Maybe  a  hybrid  would  be 
feasible. 

7.2.7.  Recursive  Types 

A  recursive  type  is  a  type  defined  in  terms  of  itself.  Recursive  types  are  most  often  used  in  defining 
lists,  sequences,  or  trees  of  unbounded  size.  The  problem  of  representing  recursive  types  is  similar  to 
the  problem  of  representing  recursive  function  values  or  any  other  infinite  structure.  The  basic 
problem  is  how  to  represent  the  infinite  structure  in  finite  space  and  time  while  providing  convenient 
mechanisms  for  manipulating  and  comparing  values  of  the  infinite  structure.  There  are  several  ways 
recursive  types  might  be  implemented  in  Phi. 


Clearly,  some  kind  of  delay  mechanism  is  needed  to  avoid  going  into  an  infinite  loop  in  trying  to 
evaluate  (list-of  t)  in  the  example  above.31  Function  abstraction  generally  provides  a  kind  of  quoting 
that  delays  evaluation  of  the  function  body  until  the  function  is  invoked,  rather  than  evaluating  the 
body  when  the  function  value  (closure)  is  created. 


Now  compare  the  following: 


∀ t, (element-type-of (list-of t)) = t


∀ t, (apply (lambda () t) ()) = t


The list-of operation creates a list type where the elements must be type t, and element-type-of returns a
list  type’s  element  type.  Note  that  these  operations  deal  with  type  values  --  they  do  not  create  or 
examine  list  values.  Lambda  (with  an  empty  formal  parameter  list,  in  this  case)  creates  a  function 
abstraction,  and  apply  applies  the  function  abstraction  (to  an  empty  actual  parameter  list,  in  this  case), 
as  in  LISP  [McCarthy  66].  The  operation  list-of  is  analogous  to  function  abstraction,  and  the  operation 
element-type-of  is  analogous  to  function  application. 


The  example  above  showed  that  there  is  an  analogy  between  function  abstraction  and  the  kind  of 
delay  mechanism  needed  to  allow  recursive  type  definitions.  Could  function  abstraction  be  used  to 
implement recursive types? Certain type operations, such as list-of, might act as function abstractions,
and one of these would have to enclose each appearance of the type name being recursively defined.
(Note that this corresponds to the Algol 68 or Pascal rules for defining recursive types, in which an
intervening reference or pointer type must be used in any recursive type definition.) Other type
operations, such as element-type-of, would act as function application, forcing the element type of the
list  to  be  computed,  just  as  function  application  causes  the  function  body  to  be  evaluated. 
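
The analogy can be tried out in Python, with the caveat that this is only a sketch and not a worked-out design for Phi. Because Python evaluates arguments eagerly, the delay is written explicitly as a zero-argument lambda passed to list_of; ListType and its field name are hypothetical.

```python
from typing import Any, Callable, NamedTuple


class ListType(NamedTuple):
    """A list type whose element type is held unevaluated."""
    element_thunk: Callable[[], Any]


def list_of(element_thunk: Callable[[], Any]) -> ListType:
    """Acts like function abstraction: the element type is not computed yet."""
    return ListType(element_thunk)


def element_type_of(list_type: ListType) -> Any:
    """Acts like function application: forces the element type."""
    return list_type.element_thunk()


# A non-recursive use: element-type-of undoes list-of.
number_list = list_of(lambda: "number")

# A recursive type: a tree is a list of trees. The lambda delays the
# reference to tree_type until after the name has been bound.
tree_type = list_of(lambda: tree_type)
```

Forcing the element type of tree_type yields tree_type itself, so the infinite structure is represented in finite space.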


This  approach  has  not  been  worked  out  for  Phi.  We  do  not  know  if  it  would  be  feasible  or  practical. 


31 P. Z. Ingerman's Thunk, used to implement call-by-name parameter passing in Algol 60, is the classic example of a delay
mechanism [Pratt 75]. Lazy evaluation [Henderson 80] is another technique.

7.2.8. ERT Subtypes


One unsatisfactory aspect of ERT triplets <e,r,t> is that if the type component is ert (indicating that
the expression will evaluate to an ERT in the next phase), there is no further information about what
type of value might be computed in the following phase. In fact, the Static-IL construct deep-const
hides the types of constants until the phase before the constant will be used, thus preventing any
compiletime  type  errors  regarding  that  constant  from  being  detected  earlier.  It  would  certainly  be 
better  to  detect  all  errors  as  early  as  possible. 

One  way  to  support  earlier  error  detection  might  be  to  introduce  subtypes  of  the  ert  type  that 
provide  some  information  about  an  expression’s  final  type,  if  known.  Consider  the  following  Static-Phi 
program. 

λ x : t1 -> t2 . (g x)

Recall from Section 3.3.4 that every subexpression starts out being type ert; thus the λ expression
above will initially be considered type ert. But regardless of what types t1 and t2 turn out to be, it is
syntactically obvious that the above expression will eventually evaluate to some kind of function value.
Hence, it might be useful to initially consider the expression to be a type that is a subtype of ert, such as
"ert of fun", which carries more information than the simple ert type carries. Similarly, if t1 and t2
happen to be type constants such as number, an even more specific subtype might be returned, such as
"ert of <fun number number>", which represents the type of an expression that will become a function
from numbers to numbers in some future phase.

We  do  not  know  whether  ERT  subtypes  will  provide  the  right  practical  mechanism  for  early  error 
detection,  or  whether  some  other  approach  would  be  better. 

7.2.9.  Statically  Inferred  Phases 

In Static-Phi, phases are assigned statically by the Translator, based on emit and eval constructs
explicitly embedded in the Static-Phi program. To ease the programmer's burden, it might be possible
to have the Phi Translator automatically determine which subcomputations should be performed
during which phases, without requiring the programmer to designate them explicitly.

Appendix A:

Formal  Semantics  of  Static-Phi  and  Static-IL 


Introduction 

This  section  gives  a  semantics  for  phase  evaluation  of  Static-Phi  expressions.  The  semantics  of  a 
Static-Phi  expression  are  given  by  two  sets  of  semantic  equations:  one  set  of  equations  corresponds 
to  translating  the  Static-Phi  expression  into  a  Static-IL  expression  (in  the  Expression  component  of 
an ERT); the other set corresponds to evaluating a Static-IL expression. This is therefore an
operational  semantics,  though  we  write  it  in  a  denotational  style  using  continuations. 

Static-Phi  Syntax  Domains 


id ∈ ID         -- Identifiers.
b  ∈ BOOLEAN    -- Booleans.
n  ∈ NUMBER     -- Numbers.
t  ∈ BTYPE      -- Basic type constants.
e  ∈ EXPR       -- Static-Phi expressions.

A program is an EXPR.


Static-Phi  Syntax  Equations 


ID                                      -- Identifiers.

BOOLEAN = true, false                   -- Booleans.

NUMBER  = 0, 1, 2, ...                  -- Numbers.

BTYPE   = boolean, number, type, ert    -- Basic (non-function) type constants.

EXPR    = ID                            -- Identifier
        + BOOLEAN                       -- Boolean constant
        + NUMBER                        -- Number constant
        + BTYPE                         -- Basic (non-function) type constant
        + (funtype EXPR EXPR )          -- For expressing the types of functions
        + (emit EXPR )                  -- Normal runtime phase is one phase later
        + (eval EXPR )                  -- Normal runtime phase is one phase earlier
        + λ ID : EXPR -> EXPR . EXPR    -- Abstraction, with parameter, return types
        + ( EXPR EXPR )                 -- Function application
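
As a reading aid, the syntax equations can be transcribed into a small Python abstract syntax. This is a sketch only: the class and field names are invented, and the free_ids helper assumes that a λ parameter is bound in the body but not in the two type expressions, which the syntax alone does not settle.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Id:
    name: str

@dataclass(frozen=True)
class Bool:
    value: bool

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class BType:
    name: str            # "boolean", "number", "type" or "ert"

@dataclass(frozen=True)
class FunType:
    domain_type: "Expr"
    range_type: "Expr"

@dataclass(frozen=True)
class Emit:
    body: "Expr"

@dataclass(frozen=True)
class Eval:
    body: "Expr"

@dataclass(frozen=True)
class Lam:
    param: str
    param_type: "Expr"
    return_type: "Expr"
    body: "Expr"

@dataclass(frozen=True)
class App:
    fn: "Expr"
    arg: "Expr"

Expr = Union[Id, Bool, Num, BType, FunType, Emit, Eval, Lam, App]


def free_ids(e: Expr) -> set:
    """Identifiers occurring free in e (scoping rule assumed, see lead-in)."""
    if isinstance(e, Id):
        return {e.name}
    if isinstance(e, (Bool, Num, BType)):
        return set()
    if isinstance(e, FunType):
        return free_ids(e.domain_type) | free_ids(e.range_type)
    if isinstance(e, (Emit, Eval)):
        return free_ids(e.body)
    if isinstance(e, Lam):
        types = free_ids(e.param_type) | free_ids(e.return_type)
        return types | (free_ids(e.body) - {e.param})
    if isinstance(e, App):
        return free_ids(e.fn) | free_ids(e.arg)
    raise TypeError(f"not a Static-Phi expression: {e!r}")


# λ x : number -> number . (g x)
example = Lam("x", BType("number"), BType("number"), App(Id("g"), Id("x")))
example_free = free_ids(example)
```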


Static-IL  (Semantic)  Domains 

These domains are best interpreted as semantic domains, though they are sometimes used as though
they were syntactic domains. The reason for this is to avoid having to deal with the cumbersome
detail of two parallel domains -- one syntactic and one semantic -- and a trivial semantic
correspondence between them.


id  ∈ ID        -- Identifiers. Same domain as in Static-Phi.
b   ∈ BOOLEAN   -- Booleans. Same domain as in Static-Phi.
m,n ∈ NUMBER    -- Numbers. Same domain as in Static-Phi.
t   ∈ TYPE      -- Types. Includes both basic types and function types.
r   ∈ RENV      -- Required-environment. Lists identifiers and their types.
e   ∈ EXPR      -- Static-IL expressions. A program is an EXPR.
ert ∈ ERT       -- Triplet <e, r, t>: e is an expression, r is a list of
                   identifier-type pairs, and t is a type. We deal only with a
                   restricted set of ERT triplets, for which r lists all free
                   variables and their types in e, and e is guaranteed to
                   evaluate to a value of type t (or some error condition).
env ∈ ENV       -- Environments.
v   ∈ EV        -- Expressible values.
k   ∈ ECONT     -- Expression continuations.
err ∈ ERROR     -- "Compiletime" errors of various kinds.


Static-IL  Domain  Equations 

ID      = ID                      -- Identifiers from Static-IL.
BOOLEAN = BOOLEAN                 -- Booleans from Static-IL.
NUMBER  = NUMBER                  -- Numbers from Static-IL.

FTYPE   = (fun) × TYPE × TYPE     -- Function type: domain, range.

TYPE    = BTYPE                   -- Basic types.
        + FTYPE                   -- Function types.

RENV    = {<>}                    -- Required-environment.
        + ( ID × TYPE ) × RENV       (Identifier, type pairs.) Note that
                                     required-environments are represented
                                     slightly differently in this appendix than in
                                     the body of this work.


EXPR    = ID                      -- Static-IL expressions.
        + (quote EV )
        + (incr EXPR )
        + (check-funtype EXPR EXPR NUMBER )
        + (funtype EXPR EXPR )
        + (check-check-lambda ID EXPR EXPR EXPR NUMBER )
        + (check-lambda ID EXPR EXPR EXPR )
        + (lambda ID EXPR )
        + (check-apply EXPR EXPR )
        + (apply EXPR EXPR )
        + (deep-const EV TYPE NUMBER )

ERT     = EXPR × RENV × TYPE      -- ERT triplet. Free variables of EXPR are
                                     listed in RENV with their types. EXPR
                                     evaluates to a value of type TYPE.

CLOSURE = ID × EXPR × ENV         -- Function closures.

EV      = BOOLEAN                 -- Expressible values.
        + NUMBER
        + TYPE
        + ERT
        + CLOSURE

ENV     = ID -> EV                -- An environment is a function from
                                     identifiers to expressible values. Note that
                                     environments are represented slightly
                                     differently in this appendix than in the
                                     body of this work.

ECONT   = EV -> [ EV + ERROR ]    -- An expression continuation.

ERROR   = { error-non-type,       -- "Compiletime" errors possible.
            error-non-function,
            error-inconsistent-req-envs,
            error-type-mismatch,
            error-different-type-used-in-body,
            error-ert-expected,
            error-non-ert,
            error-body-is-not-ert,
            error-body-and-range-types-differ,
            error-arg-ready-before-function }


Meta-Language  Notation 

The translation rules and semantic equations will use a meta-language including if, where, let and
maximum constructs. They are written in this font. Comments on a line are preceded by "--".
Tuples are written, for example, as "<a, b>". Function application is written, for example, as "(f
x)". The continuation-style operator ";" also denotes function application, but it is right associative
and binds weakly. Thus, "f; g; x" means "(f (g x))". The body of a lambda abstraction "λ x . ..."
extends as far to the right as possible.

We use the operators "+" and "-" to concatenate two Required-environments, and to remove all
occurrences of an identifier from a Required-environment. We use the notation env[v/id] to
denote the environment env augmented by the binding of v to id. (Remember that a
"Required-environment" is not the same as an "environment"!) More formally, we can recursively
define these operations:

r1 + r2 = if r1 = <>
          then r2
          else let <<id1,t1>, r1'> = r1
               in <<id1,t1>, r1' + r2>

r - id = if r = <> then r
         else let <<id1,t1>, r'> = r
              in if id1 = id then r' - id
                 else <<id1,t1>, r' - id>

env[v/id] = λ id1 . if id1 = id then v else env(id1)

(Note that the equality used here between identifiers is true iff the two identifiers are the same
identifier; it has nothing to do with the values of those identifiers.)

Translating  from  Static-Phi  to  Static-IL 

A Static-Phi expression is not evaluated directly. Instead, it is first translated to a corresponding
Static-IL expression (contained in an ERT), which is in turn evaluated through one or more phases.

The  function  Trans-count  is  applied  to  the  Static-Phi  program  and  produces  the  Static-IL 
translation  by  calling  auxiliary  functions  Trans  and  Count.  These  functions  have  the  following  types: 

Trans-count: EXPR -> ERT
Count:       EXPR x NUMBER -> NUMBER
Trans:       EXPR x NUMBER -> ERT

The TYPE component of the ERT that Trans or Trans-count returns will always be ert, the EXPR
component will be the Static-IL expression corresponding to EXPR, and the RENV component will
list all the free variables appearing in that EXPR component. Each variable is initially type ert.
Trans-count is simply defined as follows:

Trans-count[ e ] = Trans[ e, Count[ e, 0 ] ]

Function  Count  is  used  to  count  the  depth  of  the  minimum  number  of  phases  required,  and  Trans 
does  the  real  translation  work.  These  functions  are  defined  below. 


Auxiliary Function "Count"


Function Count actually counts a depth, which may be positive or negative, rather than the number
of phases required. The parameter n represents the current normal runtime phase, expressed as the
number of phases before some arbitrary phase 0. Thus, a more positive n indicates an earlier phase,
and a less positive (or negative) n indicates a later phase. This numbering is the opposite of the
phase numbering used in previous chapters of this work. The reason is that in translation and
phase evaluation the emphasis is on the number of phases required, rather than the number of
phases that have already been performed. The following rules define Count.

Count[ id, n ] = n

An identifier does not need any extra type checking phases.

Count[ b, n ] = n+1
Count[ m, n ] = n+1
Count[ t, n ] = n+1

Constants need only one extra phase for type checking.

Count[ (funtype e1 e2), n ] =
    Maximum { n+1, Count[e1, n], Count[e2, n] }

The  funtype  construct  itself  needs  one  extra  phase  for  type  checking,  but  the  subexpressions  may 
need  more,  so  we  take  the  maximum. 

Count[ λ id : e1 -> e2 . e3, n ] =
    Maximum { n+2, Count[e1, n+1], Count[e2, n+1], Count[e3, n] }

The λ construct requires two extra phases: one to type check the function itself, and one to check
the types of the domain and return type expressions. Of course, the subexpressions may need more,
so, as with funtype, we take the maximum. Also note that subexpressions e1 and e2 are
implicitly inside an eval; hence the "n+1"s.

Count[ (e1 e2), n ] =
    Maximum { n+1, Count[e1, n], Count[e2, n] }

Function  application  itself  needs  one  extra  phase  for  type  checking,  but  the  subexpressions  may 
need  more,  so  again  we  take  the  maximum. 

Count[ (emit e), n ] = Count[ e, n-1 ]

The emit construct is not a runtime notion at all. The number of phases required just depends on
the subexpression, but note that its normal runtime phase will be one phase later. Trans will take
that into account during translation, so we anticipate it here by subtracting one from the current
depth.

Count[ (eval e), n ] = Count[ e, n+1 ]

The inverse of emit.
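The Count rules can be transcribed almost line for line. The sketch below assumes a hypothetical encoding of Static-Phi expressions as tagged tuples (("id", name), ("const", value), ("funtype", e1, e2), ("lambda", id, e1, e2, e3), ("app", e1, e2), ("emit", e), ("eval", e)); the tags and the function name count are illustrative, not part of the original.

```python
def count(e, n):
    """Phase depth needed by Static-Phi expression e at current depth n."""
    tag = e[0]
    if tag == "id":                      # identifiers need no extra phase
        return n
    if tag == "const":                   # boolean, number and type constants
        return n + 1
    if tag == "funtype":
        _, e1, e2 = e
        return max(n + 1, count(e1, n), count(e2, n))
    if tag == "lambda":                  # domain and range sit inside an implied eval
        _, _ident, e1, e2, e3 = e
        return max(n + 2, count(e1, n + 1), count(e2, n + 1), count(e3, n))
    if tag == "app":
        _, e1, e2 = e
        return max(n + 1, count(e1, n), count(e2, n))
    if tag == "emit":                    # runs one phase later
        return count(e[1], n - 1)
    if tag == "eval":                    # runs one phase earlier
        return count(e[1], n + 1)
    raise ValueError("unknown expression tag: %r" % (tag,))

number = ("const", "number")
identity = ("lambda", "x", number, number, ("id", "x"))
print(count(identity, 0))                         # 2
print(count(("app", identity, ("const", 5)), 0))  # 2
print(count(("emit", ("const", 5)), 0))           # 0
```

In the first example the lambda itself contributes n+2 = 2, and each type constant, counted at depth n+1 = 1, also yields 2.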


Translation  Rules 

Trans[ id, n ] = < id, <<id, ert>, <>>, ert >       -- Identifiers

Each identifier is initially type ert. The EXPR component is simply the identifier, hence the
required-environment only lists this one identifier of type ert as the free variable appearing in it.

Trans[ b, n ] = < e, <>, ert >,                     -- Boolean constants
    where e ∈ EXPR = (deep-const b boolean n-1)

A constant is translated to an ERT in which the EXPR component is a deep-const expression.
There are no free variables in it; hence the required-environment component of the returned ERT is
empty.

Trans[ m, n ] = < e, <>, ert >,                     -- Number constants
    where e ∈ EXPR = (deep-const m number n-1)

Similar to boolean constants.

Trans[ t, n ] = < e, <>, ert >,                     -- Basic type constants
    where e ∈ EXPR = (deep-const t type n-1)

Similar to boolean constants.

Trans[ (funtype e1 e2), n ] =                       -- Types of functions
    let < e1', r1, ert > ∈ ERT = Trans[ e1, n ],
        < e2', r2, ert > ∈ ERT = Trans[ e2, n ]
    in < (check-funtype e1' e2' n-1), (r1 + r2), ert >

The subexpressions are translated to ERTs, and their RENV (required-environment) and EXPR
components are combined to form the resulting ERT. The expression components simply become
the subexpressions of a check-funtype expression; they will evaluate to ERTs in the first
evaluation phase. The required-environments are simply concatenated because the free variables of
the whole check-funtype expression are simply the free variables of the subexpressions e1' and
e2'. All variables are type ert in the first phase.


Trans[ λ id : e1 -> e2 . e3, n ] =                  -- Abstraction
    let < e1', r1, ert > ∈ ERT = Trans[ e1, n-1 ],
        < e2', r2, ert > ∈ ERT = Trans[ e2, n-1 ],
        < e3', r3, ert > ∈ ERT = Trans[ e3, n ]
    in < (check-check-lambda id e1' e2' e3' n-2),
         (r1 + (r2 + (r3 - id))),
         ert >

The subexpressions are translated to ERTs, then combined to form the resulting ERT.
Subexpressions e1 and e2, which give the function's domain and range types, are inside an implied
eval; hence they are translated so that the function's domain and range types will be computed one
phase before the function value (closure) is computed. The EXPR component of the resulting ERT
simply uses the original bound variable, id, and the EXPR components from translating the
subexpressions to form the check-check-lambda expression. The required-environments are
combined, but since id is a locally bound variable inside the body expression e3', it is removed
from r3 before being combined with r1 and r2. However, id is not locally bound in e1' or e2' (i.e.
e1' and e2' are in an outer scope), so it is not removed from r1 or r2.


Trans[ (e1 e2), n ] =                               -- Function application
    let < e1', r1, ert > ∈ ERT = Trans[ e1, n ],
        < e2', r2, ert > ∈ ERT = Trans[ e2, n ]
    in < (check-apply e1' e2'), (r1 + r2), ert >


The  subexpressions  are  translated  and  combined  to  form  the  resulting  check-apply  ERT. 
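As a rough illustration, the translation rules given in this section can be written as one recursive function. The encoding is an assumption made for this sketch: Static-Phi expressions are tagged tuples (constants carry their type name, as in ("const", 5, "number")), an ERT is a tuple (expr, renv, type), and a Required-environment is a list of (identifier, type) pairs. Rules for emit and eval are not covered here.

```python
ERT = "ert"  # assumed name for the type of ERT values

def trans(e, n):
    """Translate Static-Phi expression e to an ERT (expr, renv, type) at depth n."""
    tag = e[0]
    if tag == "id":
        return (e, [(e[1], ERT)], ERT)
    if tag == "const":                       # ("const", value, type-name)
        _, value, tname = e
        return (("deep-const", value, tname, n - 1), [], ERT)
    if tag == "funtype":
        x1, r1, _ = trans(e[1], n)
        x2, r2, _ = trans(e[2], n)
        return (("check-funtype", x1, x2, n - 1), r1 + r2, ERT)
    if tag == "lambda":                      # ("lambda", id, domain, range, body)
        _, ident, e1, e2, e3 = e
        x1, r1, _ = trans(e1, n - 1)
        x2, r2, _ = trans(e2, n - 1)
        x3, r3, _ = trans(e3, n)
        body_free = [(i, t) for (i, t) in r3 if i != ident]   # r3 - id
        return (("check-check-lambda", ident, x1, x2, x3, n - 2),
                r1 + r2 + body_free, ERT)
    if tag == "app":
        x1, r1, _ = trans(e[1], n)
        x2, r2, _ = trans(e[2], n)
        return (("check-apply", x1, x2), r1 + r2, ERT)
    raise ValueError("no translation rule sketched for %r" % (tag,))

expr, renv, typ = trans(("app", ("id", "f"), ("const", 5, "number")), 1)
print(expr)
print(renv)
print(typ)
```

The bound variable is removed only from the body's Required-environment, which mirrors the scoping remark made for the abstraction rule.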


Phase  Evaluation 


We  now  define  a  function,  Pheval,  that  evaluates  a  Static-IL  expression  relative  to  some 
environment  env.  Pheval  has  the  following  type: 


Pheval: EXPR -> ENV -> ECONT -> [ EV + ERROR ], or equivalently:
Pheval: EXPR -> ENV -> ( EV -> [ EV + ERROR ] ) -> [ EV + ERROR ].


The  environment  env  is  a  function,  with  the  following  type: 


env : ENV = ID -> EV


Pheval uses two auxiliary functions: Funtype? and Consistent?. Funtype? is used on TYPEs. It is
true for functional types, i.e. types of the form <fun t1, t2>, for some types t1 and t2. It is not
defined here, but has the following type:


Funtype?: TYPE -> BOOLEAN


Auxiliary function Consistent? checks whether the types of identifiers in two required-environments
are consistent. In other words, for each identifier and type <id,t> in the first required-environment,
Consistent? checks every identifier-type pair <id',t'> in the second required-environment, and
returns false if two identifiers id and id' match but their types t and t' differ. Otherwise it returns
true. Consistent? has the following type:


Consistent?: RENV x RENV -> BOOLEAN


Formally,  Consistent?  can  be  recursively  defined  as  follows. 
Consistent?( r1, r2 ) =
    if r1 = <> or r2 = <>
    then true
    else let <<id1,t1>, r1'> ∈ RENV = r1,
             <<id2,t2>, r2'> ∈ RENV = r2
         in if id1 = id2 and t1 ≠ t2
            then false
            else Consistent?( r1', r2 ) and Consistent?( r1, r2' )
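The recursive definition can be sketched in Python, again assuming that a required-environment is encoded as a list of (identifier, type) pairs. The second function, consistent_pairwise, is not from the original; it is an equivalent formulation that compares every pair once instead of re-visiting pairs through the double recursion.

```python
def consistent(r1, r2):
    """Direct transcription of the recursive definition of Consistent?."""
    if not r1 or not r2:
        return True
    (id1, t1), rest1 = r1[0], r1[1:]
    (id2, t2), rest2 = r2[0], r2[1:]
    if id1 == id2 and t1 != t2:
        return False
    return consistent(rest1, r2) and consistent(r1, rest2)

def consistent_pairwise(r1, r2):
    """Same result: no identifier may appear with different types in r1 and r2."""
    return all(t1 == t2
               for (id1, t1) in r1
               for (id2, t2) in r2
               if id1 == id2)

ok = ([("x", "ert")], [("x", "ert"), ("y", "number")])
clash = ([("x", "ert"), ("y", "type")], [("z", "ert"), ("y", "number")])
print(consistent(*ok), consistent_pairwise(*ok))
print(consistent(*clash), consistent_pairwise(*clash))
```

The transcription can take exponential time on long required-environments, since both recursive calls revisit overlapping pairs; the pairwise form avoids that at the cost of departing from the shape of the definition.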

Phase  Evaluation  Semantic  Rules 

Pheval[  id  ](env)(k)  =  k(env(id))  —  Identifier 

The  value  of  the  identifier  is  simply  retrieved  from  the  environment.  For  at  least  the  first  phase  after 
translation,  the  identifier  is  guaranteed  to  evaluate  to  an  ERT.  In  a  subsequent  phase,  it  may 
evaluate  to  a  value  of  some  other  type. 

Pheval[ (quote v) ](env)(k) = k(v)                  -- Quoted value

A  quoted  value  is  simply  returned  as  is.  Quoted  values  may  be  of  various  types. 

Pheval[ (incr e) ](env)(k) =                        -- Increment (add 1)
    Pheval[ e ](env);
    λ v ∈ EV . k(v+1)

The  subexpression  is  evaluated  (it  will  be  a  number),  and  the  resulting  value  plus  one  is  passed  on  to 
the  continuation. 

Pheval[ (check-funtype e1 e2 n) ](env)(k) =         -- For function type
    Pheval[ e1 ](env);
    λ < e1', r1, t1 > ∈ ERT . Pheval[ e2 ](env);
    λ < e2', r2, t2 > ∈ ERT .
    if Consistent?( r1, r2 )
    then if n > 0
         then if t1 = ert and t2 = ert
              then k( < (check-funtype e1' e2' n-1), r1+r2, ert > )
              else error-ert-expected               -- Needed an ERT.
         else if t1 = type and t2 = type
              then k( < (funtype e1' e2'), r1+r2, type > )
              else error-non-type                   -- Needed type.
    else error-inconsistent-req-envs

Excluding errors, check-funtype always evaluates to an ERT, passing it on to the expression
continuation. The purpose of check-funtype is to generate a funtype expression whose
subexpressions are guaranteed to evaluate to TYPEs. In contrast with funtype, the subexpressions
of check-funtype evaluate to ERTs.

Subexpressions e1 and e2 are first evaluated; they evaluate to intermediate ERTs. These
intermediate ERTs will be combined to form the resulting ERT, whose expression component will
either be a check-funtype or a funtype EXPR. The required-environment components of the
intermediate ERTs must be consistent, since they will be concatenated to form the
required-environment of the resulting ERT. The phase depth n determines whether a
check-funtype or a funtype expression is to be generated. If n>0, we still have one or more phases
to go before we should generate a funtype expression, so another check-funtype expression is
generated and its subexpressions must evaluate to ERTs; otherwise (when n=0), we must generate a
funtype expression, and its subexpressions must evaluate to TYPEs.


Pheval[ (funtype e1 e2) ](env)(k) =                 -- Function type
    Pheval[ e1 ](env);
    λ t1 ∈ TYPE . Pheval[ e2 ](env);
    λ t2 ∈ TYPE . k( <fun t1, t2> )                 -- t1 is domain; t2 is range.


Excluding errors, funtype always evaluates to a TYPE. In contrast with check-funtype, funtype's
subexpressions both evaluate to TYPEs. The final result will be a function type, containing the types
of the function's domain and range, obtained from evaluating subexpressions e1 and e2.
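A small continuation-passing interpreter makes the two-phase behaviour of check-funtype and funtype concrete. This is a sketch under assumptions: expressions are tagged tuples, an ERT is a tuple (expr, renv, type), type names are strings, errors are values of a hypothetical Error class that are returned instead of being passed to the continuation, and the ";" operator becomes an ordinary nested call.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Error:
    name: str

def consistent(r1, r2):
    return all(t1 == t2 for (i1, t1) in r1 for (i2, t2) in r2 if i1 == i2)

def pheval(e, env, k):
    tag = e[0]
    if tag == "id":
        return k(env(e[1]))
    if tag == "quote":
        return k(e[1])
    if tag == "incr":
        return pheval(e[1], env, lambda v: k(v + 1))
    if tag == "funtype":
        _, e1, e2 = e
        return pheval(e1, env, lambda t1:
               pheval(e2, env, lambda t2: k(("fun", t1, t2))))
    if tag == "check-funtype":
        _, e1, e2, n = e
        def combine(ert1, ert2):
            (x1, r1, t1), (x2, r2, t2) = ert1, ert2
            if not consistent(r1, r2):
                return Error("error-inconsistent-req-envs")
            if n > 0:                                  # more checking phases to go
                if t1 == "ert" and t2 == "ert":
                    return k((("check-funtype", x1, x2, n - 1), r1 + r2, "ert"))
                return Error("error-ert-expected")
            if t1 == "type" and t2 == "type":          # last checking phase
                return k((("funtype", x1, x2), r1 + r2, "type"))
            return Error("error-non-type")
        return pheval(e1, env, lambda a: pheval(e2, env, lambda b: combine(a, b)))
    raise ValueError("construct not covered by this sketch: %r" % (tag,))

def no_bindings(ident):
    raise KeyError(ident)

def type_ert(t):
    """An expression that evaluates to the ERT <(quote t), <>, type>."""
    return ("quote", (("quote", t), [], "type"))

phase1 = pheval(("check-funtype", type_ert("number"), type_ert("boolean"), 0),
                no_bindings, lambda v: v)
print(phase1)
print(pheval(phase1[0], no_bindings, lambda v: v))
```

The first phase yields an ERT whose expression is a funtype; evaluating that expression in the next phase yields the function type itself.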

Pheval[ (check-check-lambda id e1 e2 e3 n) ](env)(k) =
    Pheval[ e1 ](env);
    λ <e1', r1, t1> ∈ ERT . Pheval[ e2 ](env);
    λ <e2', r2, t2> ∈ ERT . Pheval[ e3 ](env[ <id, <<id,ert>, <>>, ert> / id ]);
    λ <e3', r3, t3> ∈ ERT .
    if Consistent?( r1, r2 )
    then if Consistent?( <<id,ert>, <>>, r3 ) and Consistent?( r1+r2, (r3-id) )
         then if t3 = ert
              then if n > 0
                   then if t1 = ert and t2 = ert
                        then let e = (check-check-lambda id e1' e2' e3' n-1)
                             in k( < e, (r1+r2+(r3-id)), ert > )
                        else error-non-ert          -- Needed ERT
                   else if t1 = type and t2 = type
                        then let e = (check-lambda id e1' e2' e3')
                             in k( < e, (r1+r2+(r3-id)), ert > )
                        else error-non-type         -- Not a type expr
              else error-body-is-not-ert            -- Body should be ERT
         else error-different-type-used-in-body     -- Clash of used/expected types
    else error-inconsistent-req-envs

Excluding  errors,  check-check-lambda  always  evaluates  to  an  ERT  and  passes  it  on  to  the 
expression  continuation.  The  purpose  of  check-check-lambda  is  to  generate  a  check-lambda 
whose  first  two  subexpressions  are  guaranteed  to  evaluate  to  values  of  type  TYPE.  In  contrast  to 
check-lambda,  all  of  check-check-lambda's  subexpressions  evaluate  to  ERT's. 

Bound variable id is local to subexpression e3 (e3 is in a new scope), and will be bound to an ERT
within e3, whereas subexpressions e1 and e2 are considered to be in some outer scope. We first
evaluate e1 and e2 in the outer environment, and then evaluate e3 in an environment in which the
bound variable id is bound to the following ERT: <id, <<id,ert>, <>>, ert>. The expression
component is simply the identifier, and it will evaluate to an ERT; hence the type component is ert,
and the only free variable listed in the required-environment component is the identifier itself. In
effect, this declares instances of id already appearing in the body to be type ert. (New instances
may be introduced, however, when subexpressions evaluate to ERTs, as with macro expansion.)

The first Consistent? test ensures that any variables used in the domain and range subexpressions
have the same types. The second and third Consistent? tests ensure that the body expects the formal
parameter to be type ert and that any other free variables appearing in the body have the same types
as they do in the domain and range subexpressions. These tests are necessary because
subexpressions that evaluate to ERTs can introduce new references to bound variables.

If the phase depth n>0, we have to generate another check-check-lambda, in which case the
domain and range subexpressions must again evaluate to ERTs; otherwise, we generate a
check-lambda and the domain and range subexpressions must evaluate to TYPEs.

Pheval[ (check-lambda id e1 e2 e3) ](env)(k) =      -- e1, e2 will be TYPEs
    Pheval[ e1 ](env);
    λ t1 ∈ TYPE . Pheval[ e2 ](env);
    λ t2 ∈ TYPE . Pheval[ e3 ](env[ <id, <<id,t1>, <>>, t1> / id ]);
    λ < e3', r3, t3 > ∈ ERT .
    if Consistent?( <<id,t1>, <>>, r3 )
    then if t2 = t3
         then k( < (lambda id . e3'), r3 - id, <fun t1, t2> > )
         else error-body-and-range-types-differ
    else error-different-type-used-in-body          -- id has different type in body

Excluding  errors,  check-lambda  always  evaluates  to  an  ERT.  Its  purpose  is  to  generate  a  lambda 
expression  whose  body  expects  the  formal  parameter  to  be  the  type  declared  for  it. 

Type subexpressions e1 and e2 are evaluated to types t1 and t2, then body subexpression e3 is
evaluated to an ERT in an environment that includes a binding of the formal parameter id to the ERT
<id, <<id,t1>, <>>, t1>. In effect, this declares existing instances of id in the body to be type t1.
The Consistent? test is used to verify that a new instance of the formal parameter with a different
type has not been injected into the body expression (as can happen with macro expansion). Finally,
the body type must agree with the function's declared range type.

Pheval[ (lambda id . e) ](env)(k) = k( < id, e, env > )      -- Create a closure

This is a function abstraction. To implement lexical scoping, a closure of the bound variable,
expression body, and current environment is created and returned.
Pheval[ (check-apply e1 e2) ](env)(k) =             -- e1 and e2 will be ERTs
    Pheval[ e1 ](env);
    λ < e1', r1, t1 > ∈ ERT . Pheval[ e2 ](env);
    λ < e2', r2, t2 > ∈ ERT .
    if Consistent?( r1, r2 )
    then if Funtype?( t1 )
         then let <fun t11, t12> ∈ FTYPE = t1       -- Domain, range
              in if t11 = t2
                 then k( < (apply e1' e2'), r1+r2, t12 > )
                 else error-type-mismatch           -- Formal-actual type mismatch
         else if t1 = ert
              then if t2 = ert
                   then k( < (check-apply e1' e2'), r1+r2, ert > )
                   else error-arg-ready-before-function
              else error-non-function               -- Not function or ert
    else error-inconsistent-req-envs


Excluding errors, check-apply will always evaluate to an ERT. Its purpose is to generate an apply
expression that has been type checked to guarantee that the first argument will evaluate to a
function, and the second argument will evaluate to the type declared for the function's formal
parameter. In contrast with apply, the subexpressions of check-apply both evaluate to ERTs.


Subexpressions e1 and e2 are evaluated to ERTs, and the required-environments of these ERTs must
be consistent; they are checked as in previous cases. If t1 is a function type, the function
subexpression e1' will evaluate to a function to be applied to the actual parameter in the next phase;
hence the type of the actual parameter must match the function's declared formal parameter type,
and an apply ERT will be generated. Otherwise, t1 should be ert, indicating that the function
subexpression will again evaluate to an ERT during the next phase. In this case, t2 should also be ert
(indicating that the argument subexpression will also evaluate to an ERT), and another check-apply
will be generated. At this point, it is an error if t2 isn't ert, since this means that the argument is
ready to evaluate to some fixed, non-ERT value before the function expression is ready to evaluate
to a function value.


Pheval[ (apply e1 e2) ](env)(k) =                   -- Function application
    Pheval[ e1 ](env);
    λ < id, e, env' > ∈ CLOSURE . Pheval[ e2 ](env);
    λ v ∈ EV . Pheval[ e ](env'[v/id])(k)


Normal function application. Subexpression e1 is guaranteed to evaluate to a function closure, and
e2 evaluates to the actual parameter, the type of which is guaranteed to be the domain type of the
function.
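The interplay of check-apply, apply and lambda can be traced with a small sketch. As before, the encoding is assumed rather than taken from the original: tagged-tuple expressions, ERTs as (expr, renv, type) tuples, closures as (id, body, env) triples, a hypothetical Error class for compiletime errors, and function types written ("fun", domain, range).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Error:
    name: str

def consistent(r1, r2):
    return all(t1 == t2 for (i1, t1) in r1 for (i2, t2) in r2 if i1 == i2)

def is_funtype(t):
    return isinstance(t, tuple) and len(t) == 3 and t[0] == "fun"

def extend(env, v, ident):
    return lambda i: v if i == ident else env(i)

def pheval(e, env, k):
    tag = e[0]
    if tag == "id":
        return k(env(e[1]))
    if tag == "quote":
        return k(e[1])
    if tag == "incr":
        return pheval(e[1], env, lambda v: k(v + 1))
    if tag == "lambda":                      # closure <id, body, env>
        return k((e[1], e[2], env))
    if tag == "apply":                       # operands already type checked
        def call(closure, v):
            ident, body, cenv = closure
            return pheval(body, extend(cenv, v, ident), k)
        return pheval(e[1], env, lambda c: pheval(e[2], env, lambda v: call(c, v)))
    if tag == "check-apply":
        def check(ert1, ert2):
            (x1, r1, t1), (x2, r2, t2) = ert1, ert2
            if not consistent(r1, r2):
                return Error("error-inconsistent-req-envs")
            if is_funtype(t1):
                _, dom, rng = t1
                if dom == t2:
                    return k((("apply", x1, x2), r1 + r2, rng))
                return Error("error-type-mismatch")
            if t1 == "ert":
                if t2 == "ert":
                    return k((("check-apply", x1, x2), r1 + r2, "ert"))
                return Error("error-arg-ready-before-function")
            return Error("error-non-function")
        return pheval(e[1], env, lambda a: pheval(e[2], env, lambda b: check(a, b)))
    raise ValueError("construct not covered by this sketch: %r" % (tag,))

def no_bindings(ident):
    raise KeyError(ident)

inc = (("lambda", "x", ("incr", ("id", "x"))), [], ("fun", "number", "number"))
arg = (("quote", 41), [], "number")
bad = (("quote", True), [], "boolean")

checked = pheval(("check-apply", ("quote", inc), ("quote", arg)), no_bindings, lambda v: v)
print(checked[2])                                        # number
print(pheval(checked[0], no_bindings, lambda v: v))      # 42
print(pheval(("check-apply", ("quote", inc), ("quote", bad)), no_bindings, lambda v: v))
```

The apply branch performs no type test at all; in this sketch, as in the rules above, all checking happens one phase earlier in check-apply.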


Proving  That  No  Runtime  Type  Errors  Are  Possible 


This section briefly sketches how to approach proving the assertion that runtime type errors
are not possible in Static-IL. The more specific assertion is that every ERT generated by this system
is valid. (This will be clarified below.) Overall, the proof is by induction on the number of phases
used to produce the ERT. The basis is zero phases, when the ERT is produced directly by the
Translator. Both the base case and the induction step are, in turn, proved using structural induction
on the original Static-Phi program or the Static-IL Expression component of the ERT.

First  off,  we  must  define  what  we  mean  by  "runtime  type  error"  in  order  to  show  that  such  errors 
are  not  possible.  The  easiest  way  to  do  this  is  probably  to  add  an  explicit  type  tag  to  the  values  that 
are  manipulated  by  Static-IL  programs,  and  then  to  define  runtime  type  errors  in  terms  of  these 
type  tags. 

Next  we  must  specifically  define  what  it  means  for  an  environment  to  satisfy  a 
Required-environment.  The  environment  must  supply  bindings  of  the  proper  types  for  all  of  the 
identifiers  listed  in  the  Required-environment. 

Now, we really want to prove that every ERT produced by this system is "valid", so we must define
"valid". Basically, an ERT <e,r,t> is valid if, in an environment that satisfies the
Required-environment r, e is guaranteed to evaluate to a value of type t (or to some compile-time
error value) without incurring any runtime type errors.
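The notion of an environment satisfying a Required-environment can be made concrete once values carry explicit type tags, as suggested above. The following is only a sketch under assumed encodings: a tagged value is a (type, payload) pair, an environment is a function that raises KeyError for unbound identifiers, and the names satisfies and env_from are hypothetical.

```python
def env_from(bindings):
    """Build an environment function from a dict of tagged values (hypothetical helper)."""
    return lambda ident: bindings[ident]

def satisfies(env, renv):
    """True if env binds every identifier listed in renv to a value of the listed type."""
    for ident, expected in renv:
        try:
            tag, _payload = env(ident)
        except KeyError:
            return False          # identifier required but not bound
        if tag != expected:
            return False          # bound, but to a value of the wrong type
    return True

env = env_from({"x": ("number", 3), "b": ("boolean", True)})
print(satisfies(env, [("x", "number"), ("b", "boolean")]))
print(satisfies(env, [("x", "boolean")]))
print(satisfies(env, [("y", "number")]))
```

Validity of an ERT <e,r,t> then quantifies over every environment for which satisfies holds on r; that universal claim is what the proof sketch establishes, and it is not something this check can test by itself.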

With the proper definitions in order, the proof would proceed by induction on the number of phases
used to produce the ERT.

Basis. The basis is when zero phases were used to produce the ERT; that is, we must first show that
the translator always produces a valid ERT. This part would be done using structural induction on
the original Static-Phi program. The most important thing to note in this part is that the result of the
Count function used in translation is completely irrelevant to the proof. The n parameter, used by
several of the Static-IL constructs to determine how many phases to wait, has no bearing on the type
correctness of the system.

Inductive hypotheses. Next, we consider any valid ERT <e,r,t>, and any environment env that
satisfies the required-environment r. Thus, the basic inductive hypothesis is that <e,r,t> is valid and
that the environment env satisfies the Required-environment r. But furthermore, we must construct
the right hypothesis on the environment to ensure that no Trojan horse runtime type errors can
sneak in through the environment. Every ERT value that comes from the environment must be
valid, and every function that comes from the environment must be assured to execute without
runtime type errors.

Induction. We must now prove that if t = ert then Pheval[e](env)(λv.v) is either a valid ERT or one
of the compiletime error values. This would proceed by structural induction on e.